The most precise measurements of the top quark mass are based on the Matrix Element method.
We present a detailed description of this analysis method, taking the measurements of the top quark
mass in final states with one and two charged leptons as concrete examples. In addition, we show how
the Matrix Element method can reduce the dominant systematic uncertainties related to
detector effects, by treating the absolute energy scales for b-quark and light-quark jets independently
as free parameters in a simultaneous fit together with the top quark mass. While the determination
of the light-quark jet energy scale has already been applied in several recent measurements, the
separate determination of the absolute b-quark jet energy scale is a novel technique with the prospect
of reducing the overall uncertainty on the top quark mass in the final measurements at the Tevatron
and in analyses at the LHC experiments. The procedure is tested on Monte Carlo generated events
with a realistic detector resolution.
I. INTRODUCTION

The Matrix Element method is unique among the analysis methods used in experimental particle physics because of
the direct link it establishes between theory and event reconstruction. Originally developed to minimize the statistical
uncertainty in measurements of $t\bar{t}$ events at the Tevatron experiments D0 and CDF [1], it has since been applied with
great success to measurements of the top quark mass $m_t$ and also in the discovery of electroweak production of
single top quarks. The method can in principle be used for any measurement, with the largest gain compared to
cut-based analysis techniques expected for processes involving intermediate resonances and leading to many-particle final
states. In general, the Matrix Element method can be used to determine several unknown parameters (theoretical 
parameters describing the physics processes measured as well as experimental parameters describing the detector 
response) at the same time in one measurement, thus also allowing for a reduction of systematic uncertainties. This 
paper presents the analysis method in general and also gives an example of how the determination of such additional 
parameters can be implemented. 

Recent measurements in $t\bar{t}$ events containing one leptonic and one hadronic W decay ("lepton+jets events") already
exploit the known W mass to constrain the energy scale for light-quark jets and significantly reduce the main systematic
uncertainty of early measurements of the top quark mass. Among the largest remaining systematic uncertainties is the
uncertainty on potential differences between the energy scales $S_b$ and $S_l$ for b- and light-quark jets^1. Without further
improvements, it will soon become a limiting uncertainty for those measurements that dominate the world-average
value. In this paper we show how a simultaneous additional measurement of the b-quark jet energy scale together with
the top quark mass, which was first proposed in [8] for the lepton+jets channel, can be incorporated naturally
in the Matrix Element technique, not only for measurements in the lepton+jets channel, but also in $t\bar{t}$ events with
two leptonic W decays ("dilepton events"), where the quantities to be measured cannot be reconstructed based on
the kinematical information of a single event alone. This study considers the case of the Tevatron (proton-antiproton
collisions at a center-of-mass energy of 1.96 TeV) as a concrete example but is applicable to the two LHC experiments
ATLAS and CMS as well.

The paper is structured as follows: In Section II, an overview of the Matrix Element method is given. Section III
discusses the generation and selection of $t\bar{t}$ events used for the studies described in the further sections. This is
followed by a discussion of the implementation of the likelihood calculation for signal and background processes for $t\bar{t}$
measurements in the lepton+jets and dilepton channels in Section IV. We then describe studies of the performance
of the new measurement technique, separately for lepton+jets (Section V) and dilepton (Section VI) $t\bar{t}$ events. In
Section VII, systematic uncertainties on $m_t$ are addressed, with an emphasis on the effect of events with significant
initial- and final-state radiation. Section VIII summarizes the findings and gives an outlook.

II. THE MATRIX ELEMENT METHOD 

The Matrix Element method is based on the likelihood $L_{\rm sample}$ to observe a sample of selected events in the detector.
The likelihood is obtained directly from the theory prediction for the differential cross-sections of the relevant processes
and the detector resolution and is calculated as a function of the assumed values for each of the parameters to be
measured. The minimization of $-\ln L_{\rm sample}$ yields the measurement of the parameters, where the likelihood $L_{\rm sample}$
for the entire event sample is computed as the product of likelihoods to observe each individual event. This is in 
contrast to most analysis methods used in experimental particle physics, where distributions from observed events in 
the detector are compared with corresponding distributions obtained from simulated events that have been generated 
according to theory and then passed through a detector simulation and the same event reconstruction software. 

This paper concentrates on the case of the measurement of the top quark mass in lepton+jets and dilepton events
at a hadron collider, where the parameters to be measured are the top quark mass $m_t$, factors $S_b$ and $S_l$ describing
the energy scales for b- and light-quark jets relative to the default energy scale^2, and the fraction $f_{t\bar{t}}$ of signal events
in the channel under consideration. A comparison of the Matrix Element method with other methods to measure the
top quark mass can be found in the literature.



^1 The quantities $S_b$ and $S_l$ denote scale factors relative to the nominal detector calibration. In the definition of these factors, it is assumed
that all experimental corrections to measured jet energies have been applied to the events that enter the analysis, such that $S_b$ and
$S_l$ do not depend on quantities like the jet direction or energy. Uncertainties on such dependencies give rise to additional systematic
uncertainties of the measurement, e.g. of the top quark mass.

^2 In dilepton $t\bar{t}$ events, only b-quark jets occur and thus only $S_b$ is determined. Only one overall scale factor $S_b$ is determined for all b
jets; thus for example no distinction is made between b jets with and without a reconstructed muon inside the jet.
A. The Event Likelihood

The sample likelihood $L_{\rm sample}$ for $N$ measured events to have measured properties $x_1, \ldots, x_N$ can be written as

$$L_{\rm sample}(x_1, \ldots, x_N;\, \vec{\alpha}, \vec{\beta}, \vec{f}) = \prod_{i=1}^{N} L_{\rm evt}(x_i;\, \vec{\alpha}, \vec{\beta}, \vec{f}) , \tag{1}$$

where the symbol $\vec{\alpha}$ denotes assumed values of the physics parameters to be measured, $\vec{\beta}$ stands for parameters
describing the detector response that are to be determined, and $\vec{f}$ is defined below. The likelihood $L_{\rm evt}(x_i;\, \vec{\alpha}, \vec{\beta}, \vec{f})$
to observe event $x_i$ under the assumption of parameter values $\vec{\alpha}$, $\vec{\beta}$, and $\vec{f}$ is given as the linear combination

$$L_{\rm evt}(x_i;\, \vec{\alpha}, \vec{\beta}, \vec{f}) = \sum_{{\rm processes}\ P} f_P\, L_P(x_i;\, \vec{\alpha}, \vec{\beta}) , \tag{2}$$

where the sum is over all individual processes $P$ that could have led to the observed event $x_i$, $L_P(x_i;\, \vec{\alpha}, \vec{\beta})$ is the
likelihood to observe this event under the assumption that it was produced via process $P$, and $f_P$ denotes the fraction
of events from process $P$ in the entire event sample, with $\sum_P f_P = 1$. In total, the physics parameters $\vec{\alpha}$, the detector
response described by $\vec{\beta}$, and the event fractions $\vec{f}$ are to be determined simultaneously from the minimization of
$-\ln L_{\rm sample}$.
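The structure of Equations (1) and (2) can be illustrated with a minimal Python sketch. It assumes that the per-process likelihoods $L_P(x_i)$ have already been computed for fixed parameter values; the function names, the two-process setup (signal and one background), and the numerical values are hypothetical and serve only to show how $-\ln L_{\rm sample}$ is assembled and scanned in the signal fraction.

```python
import math


def event_likelihood(process_likelihoods, fractions):
    """Equation (2): L_evt = sum over processes P of f_P * L_P."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("process fractions must sum to 1")
    return sum(f * lp for f, lp in zip(fractions, process_likelihoods))


def neg_log_sample_likelihood(events, fractions):
    """Equation (1): -ln L_sample = -sum_i ln L_evt(x_i)."""
    return -sum(math.log(event_likelihood(lps, fractions)) for lps in events)


# Hypothetical per-event values (L_signal, L_background) at fixed alpha, beta.
EVENTS = [(0.8, 0.1), (0.6, 0.3), (0.2, 0.9)]


def scan_signal_fraction(events, steps=99):
    """Return the signal fraction on a coarse grid that minimizes -ln L_sample."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]
    return min(grid, key=lambda f: neg_log_sample_likelihood(events, (f, 1.0 - f)))
```

In an actual measurement the scan would run over all parameters simultaneously, and a numerical minimizer would replace the coarse grid used here.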

The likelihood $L_P(x;\, \vec{\alpha}, \vec{\beta})$ in turn is given by the theoretical description of the process and the resolution of the
concrete experiment, and is computed as the convolution of the differential partonic cross section with the parton 
distribution functions of the colliding hadrons and with the detector response. To make the likelihood calculation 
manageable, simplifying assumptions are introduced. This concerns the description of both the detector response 
and the physics processes P, where only the dominant ones are considered explicitly, and where the effects of parton 
shower and hadronization are accounted for with a simple parametrization. Because of these simplifications, the 
technique has to be calibrated using fully simulated events before applying it in an actual measurement on data. In 
this paper, a conceptual study is presented. We show how the method can be validated with events that have been
generated under the same assumptions as made in the likelihood calculation, which allows us to demonstrate that the
measurement method as such is unbiased.

In Equation (2), not all likelihoods $L_P$ necessarily depend on all parameter values; e.g. for the measurement of
the top quark mass, the likelihood $L_{t\bar{t}}$ depends on the assumed top quark mass, while the likelihoods for an event to
be produced via a background process (which by definition does not involve top quark production or decay) do not.
Even if a likelihood does depend on a certain parameter, this dependency does not necessarily have to be taken into
account explicitly; for example, for the top quark mass measurement it will be shown in Sections V and VI that the
dependency of the likelihoods for background processes on the jet energy scales can be neglected without introducing 
large biases on the top quark mass and energy scale measurements. 



B. The Likelihood for one Process 



The individual contributions to the likelihood for an observed event x to be produced via a given process P are 
described in this section. They are visualized schematically in Figure 1 for the example of a lepton+jets $t\bar{t}$ event
at the Tevatron. The observed event x, shown at the right, is fixed while integrating over all possible momentum 
configurations y of final-state particles. The differential cross section for the process is convoluted with the probability 
for the final-state partons to yield the observed event (transfer function) , and with the probability to find initial-state 
partons of given flavor and momenta inside the colliding proton and antiproton (parton distribution function). All 
possible assignments of final-state particles to measured objects in the detector are considered by the transfer function. 
For each partonic final state under consideration, the initial-state parton momenta are determined by energy and 
momentum conservation. 

The likelihood for a final state with $n_f$ partons and given four-momenta $y$ to be produced in the hard-scattering
process is proportional to the differential cross section $d\sigma_P$ of the corresponding process, given by

$$d\sigma_P(a_1 a_2 \to y;\, \vec{\alpha}) = \frac{(2\pi)^4 \left| \mathcal{M}_P(a_1 a_2 \to y;\, \vec{\alpha}) \right|^2}{\xi_1 \xi_2 s}\, d\Phi_{n_f} , \tag{3}$$

where $a_1 a_2$ and $y$ stand for the kinematic variables of the partonic initial and final states, respectively. The symbol
$\mathcal{M}_P$ denotes the matrix element for this process, $s$ is the center-of-mass energy squared of the collider, $\xi_1$ and $\xi_2$
the momentum fractions of the colliding partons $a_1$ and $a_2$ (which are assumed to be massless) within the colliding
proton and antiproton^3, and $d\Phi_{n_f}$ is an element of $n_f$-body phase space.

FIG. 1: Schematic representation of the calculation of the likelihood to obtain a given observed lepton+jets event at a
proton-antiproton collider (similar figures apply to dilepton events and to other processes).

To obtain the differential cross section $d\sigma_P(p\bar{p} \to y;\, \vec{\alpha})$ in $p\bar{p}$ collisions, the differential cross section from Equation (3)
is convoluted with the parton density functions (PDF) and summed over all possible flavor compositions of the colliding
partons,

$$d\sigma_P(p\bar{p} \to y;\, \vec{\alpha}) = \int_{\xi_1, \xi_2} \sum_{a_1, a_2} d\xi_1\, d\xi_2\; f_{\rm PDF}^{a_1}(\xi_1)\, \bar{f}_{\rm PDF}^{a_2}(\xi_2)\; d\sigma_P(a_1 a_2 \to y;\, \vec{\alpha}) , \tag{4}$$

where $f_{\rm PDF}^{a_1}(\xi_1)$ and $\bar{f}_{\rm PDF}^{a_2}(\xi_2)$ denote the probability densities to find a parton of given flavor $a_1$ and momentum
fraction $\xi_1$ in the proton and one of flavor $a_2$ and momentum fraction $\xi_2$ in the antiproton, respectively. This equation
reflects QCD factorization.

The finite detector resolution is taken into account via a convolution with a transfer function $W(x, y;\, \vec{\beta})$ that
describes the probability to reconstruct a partonic final state $y$ as $x$ in the detector, given the values $\vec{\beta}$ of the
parameters describing the detector response. The differential cross section to observe a given reconstructed event $x$
then becomes

$$d\sigma_P(p\bar{p} \to x;\, \vec{\alpha}, \vec{\beta}) = \int_y d\sigma_P(p\bar{p} \to y;\, \vec{\alpha})\, W(x, y;\, \vec{\beta}) . \tag{5}$$

Only events that are inside the detector acceptance and that pass the trigger conditions and offline event selection 
are used in the measurement. To obtain a properly normalized likelihood, the overall cross section of events observable 
in the detector, 

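$$\sigma_P^{\rm obs}(\vec{\alpha}, \vec{\beta}) = \int_{x, y} d\sigma_P(p\bar{p} \to y;\, \vec{\alpha})\, W(x, y;\, \vec{\beta})\, f_{\rm acc}(x)\, dx , \tag{6}$$

is used, where $f_{\rm acc} = 1$ for selected events and $f_{\rm acc} = 0$ otherwise. One then obtains

$$L_P(x;\, \vec{\alpha}, \vec{\beta})\, dx = \frac{d\sigma_P(p\bar{p} \to x;\, \vec{\alpha}, \vec{\beta})}{\sigma_P^{\rm obs}(\vec{\alpha}, \vec{\beta})} \tag{7}$$
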



as the (differential) likelihood that an event produced via process P has measured properties x (and not other 
properties that would still lead to an event passing the event selection criteria). 
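The chain of Equations (5) to (7) can be made concrete with a one-dimensional toy model. The following Python sketch is purely illustrative: the exponential parton-level spectrum, the Gaussian transfer function with a fixed width, the cut value, and all names are assumptions, not the parametrizations used in the analysis. It shows that dividing by the observable cross section yields a likelihood that is normalized over the accepted region only.

```python
import math


def gauss(x, mean, sigma):
    """Normalized Gaussian density."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def dsigma_dy(y):
    """Toy parton-level differential cross section (arbitrary units)."""
    return math.exp(-y / 30.0)


def transfer(x, y, sigma=10.0):
    """Toy transfer function W(x, y): Gaussian smearing of y into x."""
    return gauss(x, y, sigma)


def midpoints(lo, hi, n):
    """Midpoint-rule grid on [lo, hi] with n cells; returns (points, step)."""
    step = (hi - lo) / n
    return [lo + (k + 0.5) * step for k in range(n)], step


Y, DY = midpoints(0.0, 300.0, 300)          # parton-level integration grid
X_CUT = 20.0                                # toy selection cut: f_acc = 1 for x > X_CUT
X_ACC, DX = midpoints(X_CUT, 400.0, 380)    # accepted region of observed x


def dsigma_dx(x):
    """Equation (5): integrate over all parton-level configurations y."""
    return sum(dsigma_dy(y) * transfer(x, y) for y in Y) * DY


# Equation (6): observable cross section, integrated over accepted x only.
sigma_obs = sum(dsigma_dx(x) for x in X_ACC) * DX


def likelihood(x):
    """Equation (7): likelihood density, zero outside the acceptance."""
    return dsigma_dx(x) / sigma_obs if x > X_CUT else 0.0
```

Summing `likelihood(x) * DX` over `X_ACC` returns one up to rounding, while the observable cross section is smaller than the total toy cross section because events failing the cut are not counted.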



^3 This discussion is based on the situation at the Tevatron $p\bar{p}$ collider as a concrete example but is equally valid for the LHC when the
antiproton is replaced with a proton and the appropriate PDF is used.
C. Description of the Detector Response



This section describes a parametrization of the transfer function $W(x, y;\, \vec{\beta})$ appropriate for measurements of
high-$p_T$ objects at a hadron collider. The Matrix Element method is based on a fast parametrization that reproduces the
basic properties of the detector. Any biases introduced can be determined (and then corrected for) when ensemble
tests as described in Section II F are performed with events generated with a full detector simulation, typically based
on the GEANT package. This section is written with measurements in $t\bar{t}$ events in mind but is applicable to other
processes as well.

The transfer function $W(x, y;\, \vec{\beta})$ describes the probability density $dP$ to reconstruct an assumed partonic final state
$y$ as a measurement $x$ in the detector:

$$dP = W(x, y;\, \vec{\beta})\, dx . \tag{8}$$

Because the final-state partons are assumed to result in some measured event $x$, the normalization condition

$$\int_x W(x, y;\, \vec{\beta})\, dx = 1 \tag{9}$$

holds, where the integral is over all possible events $x$. Effects due to selection cuts or finite detector acceptance are
discussed in Section II D.

The transfer function is assumed to factorize into contributions from each measured final-state particle. Aspects 
to be considered in the transfer function are in principle the measurement of the momentum of a particle (both of 
its energy and of its direction) as well as its identification. Thus b-tagging information for the jets can be included,
which can help to distinguish signal from background events. 

In many applications like the description of $t\bar{t}$ events, a number of assumptions can be made about how final-state
particles are measured in the detector, such that the dimensionality of the integration over the final-state particle
phase space described in Section II B is reduced. Individual particles can be described in the transfer function as
follows: 

• Isolated energetic electrons: Electrons are assumed to be unambiguously identified (i.e. an electron is not 
reconstructed as a muon or a jet). The electron direction and energy are both assumed to be well-measured, i.e. 
during integration, the final-state electron is assumed to be identical to the measured particle. This is justified 
for $t\bar{t}$ events since the resolution for electrons is far better than that for jets, and the jet energy resolution will
dominate all effects due to the finite detector resolution. 

• Isolated energetic muons: As for electrons, muons are assumed to be unambiguously identified, and their
directions to be precisely measured. However, instead of the energy the detector typically yields a measurement
of $(q/p_T)_\mu$, the muon charge divided by the transverse momentum. Consequently, the muon energy resolution
can be poor for high-$p_T$ muons, and thus a transfer function $W_\mu$ allowing for a finite resolution is introduced.
In the studies presented in this paper, the function

$$W_\mu\left( (q/p_T)_\mu^{\rm rec}, (q/p_T)_\mu^{\rm mat} \right) = \frac{1}{\sqrt{2\pi}\, \sigma} \exp\left( -\frac{1}{2} \left( \frac{(q/p_T)_\mu^{\rm rec} - (q/p_T)_\mu^{\rm mat}}{\sigma} \right)^2 \right) \tag{10}$$

is used to describe the likelihood that a muon with charge and momentum $(q/p_T)_\mu^{\rm mat}$ (described by the matrix
element) is reconstructed with $(q/p_T)_\mu^{\rm rec}$. The resolution $\sigma$ depends on the pseudorapidity $\eta$ to account for muon
tracks at large $|\eta|$ that do not reach the full radius of the tracking detector. The parameter values are taken
from [3].

• Energetic $\tau$ leptons: Events with energetic $\tau$ lepton decays are typically selected if the visible decay products
pass a minimum energy cut. In this case, the directions of the visible decay products are close to that of the
original $\tau$ lepton, but only a fraction of the $\tau$ energy can be measured in the detector.

In this paper, only leptonic decays $\tau \to \ell\, \bar{\nu}_\ell\, \nu_\tau$ are considered, where the symbol $\ell$ denotes an electron or
muon. Consequently, a transfer function $W_\tau(E_\ell^{\rm rec} / E_\tau^{\rm mat})$ is introduced to describe the likelihood to obtain a
charged lepton with a given energy fraction $E_\ell / E_\tau$ of the decaying $\tau$ lepton. For the study presented here, it is
parametrized as a 3rd-order polynomial as shown in Figure 2. The $\tau$ direction is taken to be well approximated
by the direction of the reconstructed charged lepton.

For muonic $\tau$ decays, the muon transfer function introduced above in principle has to be taken into account as
well to describe the transition from the assumed to the reconstructed muon transverse momentum. However,
muons from $\tau$ decays typically have low enough transverse momenta so that the muon $p_T$ can be assumed to
be well-measured in most applications. In the following, the muon transfer function is omitted for muonic $\tau$
decays. Also, in this study we consider the reconstruction efficiency to be independent of the lepton energy.
• Energetic quarks and gluons: The directions of final-state quarks and gluons are assumed to be well-measured
by the jet directions, and transfer functions are introduced for the jet energy measurement. The
probability density for a jet energy measurement $E_j^{\rm rec}$ in the detector if the true quark energy is $E_{j'}^{\rm mat}$ (depending
on the overall jet energy scale $S_b$ or $S_l$) is given by the jet energy transfer function $W_{\rm jet}(E_j^{\rm rec}, E_{j'}^{\rm mat}, \phi_{j'}^{\rm mat};\, S_\phi)$.
In principle, different transfer functions apply to gluon jets and jets from different quark flavors $\phi_{j'}^{\rm mat}$.
For $S_\phi \neq 1$, the jet transfer function is computed as

$$W_{\rm jet}(E_j^{\rm rec}, E_{j'}^{\rm mat}, \phi_{j'}^{\rm mat};\, S_\phi) = \frac{W_{\rm jet}\left( \frac{E_j^{\rm rec}}{S_\phi}, E_{j'}^{\rm mat}, \phi_{j'}^{\rm mat};\, 1 \right)}{S_\phi} , \tag{11}$$

where the factor in the denominator ensures the correct normalization in the absence of selection cuts.
In this paper, the same jet energy transfer function is used to describe light-quark (u, d, s, and c) and gluon
jets^4; an independent transfer function is used for b-quark jets. The parametrization of the transfer function
follows that of the D0 experiment, with parameters depending on the jet energy and pseudorapidity.
In a fraction of those b jets that contain a semimuonic b-hadron decay, the muon is identified, and these jets
could in principle be described with a separate transfer function [8] (while the jets with unidentified semileptonic
decays would still have to be described with one function together with all other b jets). In this paper, only one
class of b jets is considered, because the focus is to show how an energy scale for b jets can be determined at all,
and only one overall energy scale factor $S_b$ is determined for b jets. Once this is achieved, it will be possible in
principle to determine two independent energy scales for the different classes of reconstructed b jets.
The ability of the detector to distinguish quarks from gluons and to identify the quark flavor is limited. Nevertheless,
identification of b quarks (b-tagging) can be useful to distinguish signal and background events, or to
identify the correct assignment of final-state quarks to measured jets in final states like lepton+jets $t\bar{t}$ events
that contain both light and b-quark jets. In this paper, we follow a previously introduced approach and include a
term $W_b$ in the transfer function which describes the likelihood for parton $j'$ with assumed flavor $\phi_{j'}^{\rm mat}$ to be
reconstructed with b-tagging information $B_j^{\rm rec}$. If b-tagging is used as a binary decision, then one simply has

$$W_b(B_j^{\rm rec}, \phi_{j'}^{\rm mat}) = \begin{cases} \epsilon_b(\phi_{j'}^{\rm mat}) & \text{if the jet } j \text{ is } b\text{-tagged and} \\ 1 - \epsilon_b(\phi_{j'}^{\rm mat}) & \text{otherwise,} \end{cases} \tag{12}$$

where $\epsilon_b(\phi)$ is the b-tagging efficiency for a jet from a parton of flavor $\phi$.

^4 In events passing the $t\bar{t}$ selection cuts, gluon jets arise in background processes whose description is anyway only approximate; therefore
no separate transfer function for gluon jets is introduced.

FIG. 3: The function $W_b$ used to parametrize the b-tagging performance. The likelihood $W_b$ for a jet to be reconstructed as
b-tagged with given b-tagging output $B^{\rm rec}$ is shown for light, c-quark, and b-quark jets. In this paper, the output of an
artificial neural network used in the D0 experiment [10] is taken as a concrete example. The first bin contains jets that fail the
b-tagging preselection. The structure in the histogram is due to the non-equidistant binning.


Typically, b-tagging algorithms yield a continuous output (for example, the decay length significance of a secondary
vertex within a jet, or the output of an artificial neural network). Instead of a binary decision, the
quantity $W_b$ can be parametrized as a function of this continuous value. Such an approach naturally makes
optimal use of the information. As a compromise, it is possible to use several (rather than only two) bins
in the output value. This is the concept used in this publication. Figure 3 shows the values $W_b$ for jets in
lepton+jets $t\bar{t}$ events as used in the study presented here, which corresponds to the b-tagging performance of
the D0 experiment [10]. For the study in this paper, the $W_b$ functions are assumed not to depend on the jet
transverse momentum or pseudorapidity, but this will be a straightforward extension of the method for future
measurements.

• Energetic neutrinos: Neutrinos are not measured in the detector, but still an integration has to be performed
over assumed values for all momentum components of all final-state neutrinos in an event. Information on
neutrino momenta can be partly inferred from mass constraints (e.g., $m_W$ or $m_t$ in $t\bar{t}$ events). The additional
assumption is made in this paper that events are balanced in the transverse plane, i.e. that the $t\bar{t}$ system has
zero transverse momentum. This assumption is dropped in Section VII A, which means that an integration over
two additional variables has to be carried out.

The presence of neutrinos in an event is typically inferred from an imbalance in the transverse plane (non-zero
missing transverse momentum $\not{p}_T$). It is not straightforward to parametrize the resolutions of its two
components since they depend on the resolutions of all other reconstructed objects in the event. Instead, the
vector sum of transverse momenta of all reconstructed objects that are not assigned to the final state in question
could be considered. In the case of $t\bar{t}$ events at the Tevatron, this would be calorimeter measurements outside
of the jets assigned to the $t\bar{t}$ final state. In this paper, however, as in [8], no corresponding transfer function
factor is introduced.
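The role of the jet energy scale in the transfer function for quarks and gluons can be illustrated numerically. The following Python sketch rests on two assumptions that are not taken from the analysis itself: a single Gaussian with a fixed width stands in for the nominal transfer function, and the scale factor $S$ is taken to multiply the reconstructed jet energy, so that the scaled function is the nominal one evaluated at $E^{\rm rec}/S$ and divided by $S$. All names and numbers are hypothetical.

```python
import math


def w_jet_nominal(e_rec, e_mat, sigma=10.0):
    """Toy transfer function at nominal scale S = 1 (single Gaussian, assumed)."""
    z = (e_rec - e_mat) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def w_jet(e_rec, e_mat, scale, sigma=10.0):
    """Transfer function for scale S: nominal function at e_rec / S, divided by S."""
    return w_jet_nominal(e_rec / scale, e_mat, sigma) / scale


def integrate(func, lo, hi, n=4000):
    """Midpoint-rule integral of func over [lo, hi]."""
    step = (hi - lo) / n
    return sum(func(lo + (k + 0.5) * step) for k in range(n)) * step


def norm_and_mean(e_mat, scale):
    """Normalization and mean reconstructed energy for a given parton energy."""
    norm = integrate(lambda e: w_jet(e, e_mat, scale), 0.0, 400.0)
    mean = integrate(lambda e: e * w_jet(e, e_mat, scale), 0.0, 400.0)
    return norm, mean
```

For a parton energy of 100 GeV, `norm_and_mean(100.0, s)` stays normalized to one for each scale while the mean reconstructed energy moves to `s * 100` GeV, which is the behavior the division by the scale factor is meant to guarantee in the absence of selection cuts.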

In addition to the detector resolution, one has to take into account the fact that the particles measured in the 
detector cannot be assigned unambiguously to specific final-state particles. Consequently, all possibilities must be 
considered, and their contributions to the transfer function summed. 
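The total transfer function can be written as

$$
\begin{aligned}
W(x, y;\, \vec{\beta}) = \sum_{i=1}^{n_{\rm comb}} \Bigg[ \; & \prod_{e=1}^{n_e} \delta^{(3)}\left( \vec{p}_e^{\,\rm rec} - \vec{p}_{e'}^{\,\rm mat} \right) \times \\
& \prod_{m=1}^{n_\mu} \delta^{(2)}\left( \Omega_m^{\rm rec} - \Omega_{m'}^{\rm mat} \right) W_\mu\left( (q/p_T)_m^{\rm rec}, (q/p_T)_{m'}^{\rm mat} \right) \times \\
& \prod_{t=1}^{n_\tau} \delta^{(2)}\left( \Omega_t^{\rm rec} - \Omega_{t'}^{\rm mat} \right) W_\tau\left( E_t^{\rm rec} / E_{t'}^{\rm mat} \right) \times \\
& \prod_{j=1}^{n_j} \delta^{(2)}\left( \Omega_j^{\rm rec} - \Omega_{j'}^{\rm mat} \right) W_{\rm jet}\left( E_j^{\rm rec}, E_{j'}^{\rm mat}, \phi_{j'}^{\rm mat};\, S_\phi \right) W_b\left( B_j^{\rm rec}, \phi_{j'}^{\rm mat} \right) \Bigg] ,
\end{aligned} \tag{13}
$$

where the four lines represent the contributions from electrons, muons, tau leptons, and jets, respectively, and where
$\vec{p}$ denotes the momentum and $\Omega$ the direction of a particle. It is
understood that a term only appears if the corresponding particle is present in the final state under consideration.
The number of possible assignments of reconstructed ("rec") particles to final-state particles in the process described
by the matrix element ("mat") is denoted by $n_{\rm comb}$, and $i$ stands for one specific permutation. The symbols $n_e$,
$n_\mu$, $n_\tau$, and $n_j$ stand for the numbers of electrons, muons, tau leptons, and quarks or gluons in the final state. A
reconstructed particle is denoted by $e$, $m$, $t$, or $j$. The symbols $e'$, $m'$, $t'$, and $j'$ stand for the corresponding final-state
particle assumed in the matrix element integration, which is given by the index $i$ of the permutation and the index of
the reconstructed particle: $e' = e'(i, e)$ (and accordingly for muons, tau leptons, and jets). The flavor $\phi_{j'}^{\rm mat}$ of final-state
parton $j'$ assigned to jet $j$ is given by the permutation $i$. The jet energy scale appropriate for jet $j$ ($S_b$ or $S_l$) is
denoted by $S_\phi$ and selected according to the assumed flavor $\phi_{j'}^{\rm mat}$. The symbol $B_j^{\rm rec}$ stands for any output from a
b-tagging algorithm.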

Because the transfer function is assumed to factorize into independent contributions from the individual final-state
particles, it may in principle also depend on $\vec{\alpha}$. For example, in a top quark mass measurement, smaller top
quark masses correspond to a smaller mean angular separation of jets, which may lead to a broadening of the jet 
energy resolution. Such effects are however typically small and are therefore neglected in the simplified description 
of the detector response with transfer functions; they would implicitly be taken into account in a calibration of the 
measurement with fully simulated events. 
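The sum over jet-parton assignments in the total transfer function can be sketched for the jet part alone. The Python example below is a simplified illustration under several assumptions: jet directions are taken as perfectly measured (so the angular terms drop out), a single Gaussian replaces the actual jet energy transfer function, b-tagging is a binary decision, and the efficiencies are invented numbers. All function and variable names are hypothetical.

```python
import itertools
import math

# Hypothetical binary b-tagging efficiencies per assumed parton flavor.
B_TAG_EFFICIENCY = {"b": 0.6, "light": 0.01}


def w_jet(e_rec, e_mat, scale, sigma=10.0):
    """Toy jet energy transfer function (single Gaussian) for energy scale `scale`."""
    z = (e_rec / scale - e_mat) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma * scale)


def w_btag(is_tagged, flavor):
    """Binary b-tagging term: efficiency if tagged, one minus efficiency otherwise."""
    eff = B_TAG_EFFICIENCY[flavor]
    return eff if is_tagged else 1.0 - eff


def total_transfer(jets, partons, s_b=1.0, s_l=1.0):
    """Sum over all assignments of reconstructed jets to final-state partons.

    jets:    list of (e_rec, is_tagged), one entry per reconstructed jet
    partons: list of (e_mat, flavor) with flavor "b" or "light", same length
    """
    total = 0.0
    for assignment in itertools.permutations(range(len(partons))):
        term = 1.0
        for (e_rec, is_tagged), k in zip(jets, assignment):
            e_mat, flavor = partons[k]
            scale = s_b if flavor == "b" else s_l  # scale chosen by assumed flavor
            term *= w_jet(e_rec, e_mat, scale) * w_btag(is_tagged, flavor)
        total += term
    return total
```

With one tagged and one untagged jet, the assignment that pairs the tagged jet with the b quark dominates the sum, which is how b-tagging information helps to select the correct permutation without discarding the others.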

D. Normalization of the Likelihood 

The normalization condition for the likelihood $L_P$ for each process is given by

$$\int_x L_P(x;\, \vec{\alpha}, \vec{\beta})\, f_{\rm acc}(x)\, dx = 1 , \tag{14}$$

where the inclusion of the factor $f_{\rm acc}(x)$ is equivalent to integrating only over those configurations $x$ of observed
events that pass the event selection criteria. The normalization condition is fulfilled according to the definition of the
observable cross section $\sigma_P^{\rm obs}$ in Equation (6). The calculation of $\sigma_P^{\rm obs}$ is intimately related to the normalization of
the transfer function. Both aspects are discussed in this section.

1. Normalization of the Jet Energy Transfer Function 

The normalization condition of the jet energy transfer function used in previous implementations of the Matrix
Element method (e.g. [9]) is given by

$$\int W_{\rm jet}(E_j^{\rm rec}, E_{j'}^{\rm mat}, \phi_{j'}^{\rm mat};\, S_\phi)\, dE_j^{\rm rec} = 1 . \tag{15}$$

We call this a process-based normalization scheme as it reflects the concept that a final-state quark or gluon gives rise
to a jet of any energy (or is not reconstructed as a jet if the energy is below the jet reconstruction threshold of the
experiment); thus this normalization scheme does not depend on the event selection cuts. A modified, selection-based
normalization scheme which simplifies the computation of the normalization integral given in Equation (6) but leaves
the likelihood $L_P$ unchanged is introduced in this section.

FIG. 4: Jet energy transfer function $W'_{\rm jet}$ in the modified normalization scheme. Plot (a) shows the transfer functions for
light-quark jets at $\eta = 0$ for an assumed value of $S_l = 1.0$ and assumed parton energies of 15 GeV (red), 25 GeV (blue), and
35 GeV (green line). Plot (b) shows the same transfer function for different assumed $S_l$ values (violet: $S_l = 0.8$, blue: $S_l = 1.0$,
and cyan: $S_l = 1.2$) and an assumed parton energy of 25 GeV.

We assume that the event selection requires a reconstructed object for every charged lepton and every quark or
gluon in the final state (this means for example that the presence of four jets is required for lepton+jets $t\bar{t}$ events), and
that the jet selection cuts are identical for all jets. The selection-based normalization of the transfer function is based
on the concept that events only enter the analysis if all these reconstructed objects pass the corresponding selection
criteria. This means that every possible partonic final state in the integral in Equation (5) is assumed to yield an
observed event that passed the event selection. Thus, a modified jet energy transfer function $W'_{\rm jet}$ is introduced in
the top quark mass measurement, which satisfies the condition

$$\int_{E_j^{\rm rec} > E_{\rm cut}(|\eta_j|)} W'_{\rm jet}(E_j^{\rm rec}, E_{j'}^{\rm mat}, \phi_{j'}^{\rm mat};\, S_\phi)\, dE_j^{\rm rec} = 1 , \tag{16}$$

i.e. the parton under consideration is assumed to have led to a jet that passed the selection cut $E^{\rm rec} > E_{\rm cut}$, where the
energy cut normally depends on the polar angle of the jet since a transverse energy cut is used in the event selection.
Equation (16) ensures that in Equation (6),

$$\int_x W'(x, y;\, \vec{\beta})\, f_{\rm acc}(x)\, dx = 1 . \tag{17}$$

It is shown in Section II D 3 that the modified denominator $\sigma_P^{\rm obs'}$ which is then needed in Equation (7) to compute
the likelihood $L_P$ becomes independent of the parameters $\vec{\beta}$ that describe the detector response.

The effect of this selection-based normalization scheme on the jet energy transfer function $W'_{\rm jet}$ is shown in
Figure 4(a) for the double-Gaussian function used in this study: If the parton energy is assumed to be very small, then a
small reconstructed jet energy just above the cut value is most likely. However, the function is still normalized to unit
area as it is assumed that the parton must have given rise to a jet that passed the selection cut (in this example set at
$E > 20$ GeV, corresponding to $\eta = 0$ for the event selection criteria of Section III). The dependence of the jet energy
transfer function on the parameters $\vec{\beta}$ that describe the detector response must also be accounted for, as illustrated
in Figure 4(b): For every $S_l$ hypothesis the same event sample is considered in the measurement, and therefore the
event selection (in this example the minimum jet transverse energy cut) cannot depend on the assumed $S_l$ value. For
different assumed values of $S_l$, the $W'_{\rm jet}$ curve varies, and the normalization of the curve must be adjusted to ensure
that the normalization condition in Equation (16) is satisfied.
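The renormalization above the energy cut can be sketched in a few lines of Python. The example assumes a single Gaussian in place of the double-Gaussian function used in this study, a fixed width, and a scale factor that multiplies the reconstructed energy; the names and the width are hypothetical, while the 20 GeV cut and the parton energies and scale values correspond to the example discussed above.

```python
import math

E_CUT = 20.0  # GeV, jet energy cut of the example at eta = 0


def gauss_pdf(x, mean, sigma):
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def gauss_cdf(x, mean, sigma):
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def w_jet(e_rec, e_mat, scale, sigma=10.0):
    """Process-based toy transfer function, normalized over all jet energies."""
    return gauss_pdf(e_rec, scale * e_mat, scale * sigma)


def w_jet_selected(e_rec, e_mat, scale, sigma=10.0, e_cut=E_CUT):
    """Selection-based version: renormalized to unit area above the cut."""
    if e_rec <= e_cut:
        return 0.0
    # Fraction of the process-based function above the cut; depends on the scale.
    accepted = 1.0 - gauss_cdf(e_cut, scale * e_mat, scale * sigma)
    return w_jet(e_rec, e_mat, scale, sigma) / accepted


def integral_above_cut(e_mat, scale, e_max=300.0, n=5600):
    """Midpoint-rule integral of the selection-based function above the cut."""
    step = (e_max - E_CUT) / n
    return sum(w_jet_selected(E_CUT + (k + 0.5) * step, e_mat, scale)
               for k in range(n)) * step
```

The cut value is fixed, whereas the fraction `accepted` is recomputed for every assumed scale. This reproduces the point made in the text that the event selection cannot depend on the scale hypothesis while the normalization of the curve must.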
2. Normalization of the Muon and $\tau$ Transfer Functions

It is assumed that every $\tau$ lepton decays to an electron or muon that passed the event selection. The $\tau$ energy has
to be larger than the energy of the reconstructed lepton. Consequently, the selection-based normalization condition
is

$$\int_{E_\ell^{\rm rec} > E_{\rm cut}(|\eta_\ell|)} W'_\tau\left( E_\ell^{\rm rec} / E_\tau^{\rm mat} \right) dE_\ell^{\rm rec} = 1 . \tag{18}$$

Because of the non-zero lower integration bound the transfer function $W_\tau$ has to be scaled with an appropriate overall
factor that depends on the reconstructed lepton energy and pseudorapidity to arrive at the modified function $W'_\tau$.

In comparison with the jet energy resolution, the muon transverse momentum resolution is good for muons close to 
the minimum transverse momentum cut, and only a negligible fraction of muons is affected by this cut. In addition, 
it is assumed that final-state muons passing the selection cuts are always reconstructed as muons. Thus, a calculation 
of the normalization corresponding to Equation (18) can be omitted for muons.

3. Observable Cross Section 

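To derive the denominator $\sigma_P^{\rm obs'}$ with which to normalize the likelihood $L_P$ for a given process $P$, it follows from
Equations (14), (7), and (5) that

$$
\begin{aligned}
& \frac{1}{\sigma_P^{\rm obs'}} \int_{x, y} d\sigma_P(p\bar{p} \to y;\, \vec{\alpha})\, W'(x, y;\, \vec{\beta})\, f_{\rm acc}(x)\, dx = 1 \\
\Leftrightarrow \quad & \frac{1}{\sigma_P^{\rm obs'}} \int_y d\sigma_P(p\bar{p} \to y;\, \vec{\alpha}) \int_x W'(x, y;\, \vec{\beta})\, f_{\rm acc}(x)\, dx = 1 \\
\Leftrightarrow \quad & \int_y d\sigma_P(p\bar{p} \to y;\, \vec{\alpha}) = \sigma_P^{\rm obs'} ,
\end{aligned} \tag{19}
$$

where the normalization condition for the modified transfer function $W'$ (Equation (17)) has been used in the last
step. The quantity $\sigma_P^{\rm obs'}$ is thus only a function of the physics parameters $\vec{\alpha}$, but not of the detector performance
parameters $\vec{\beta}$.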

In the definition of σ_P^obs', the integral over the observed events x is not over the full phase space, but instead only
over that part of the phase space that passes the kinematic event selection. Typically, regions of small jet transverse
energy E_T or large |η| will be excluded from the integration region.

Because the normalization of the jet energy transfer function W_jet described in Section II D 1 accounts for the
lower jet E_T cut, any jet energy scale dependence of σ_P^obs' is eliminated. In contrast, the jet angular resolution is
approximated with a δ function to save integration time, and this means that the integration over y must exclude
those angular regions that do not pass the event selection. Through the angular acceptance cuts (and through the
matrix element M_P itself, of course), σ_P^obs' still depends on the physics parameters α. A similar argument holds for
angular acceptance cuts in the charged lepton selection.

The above argument is only valid if the normalization condition of Equation (17) is fulfilled for the modified transfer
function W'. In practice, this is difficult to implement for event selection cuts based on quantities that depend on
more than one reconstructed particle. For example, the p̸_T cut in Section III does not fulfill this criterion since
it depends on all measured final-state particles. Therefore, an S_b dependence of σ^obs' is taken into account in the
analysis of dilepton tt̄ events described in Section VI. For the measurement with lepton+jets events, it is shown in
Section V that the S_b and S_l dependence of σ^obs' can be neglected for the less stringent cut applied in the event
selection.



4. Process-Based Normalization Scheme



It is possible to choose a process-based transfer function normalization according to Equation (15). In this case,
the dependency of the transfer function normalization on the parameters β describing the detector resolution is not
taken into account. This means that in the last step of the derivation in Equation (19), a dependency on β remains,
and σ_P^obs has to vary as a function of both α and β.

This process-based scheme has the advantage that the transfer function W accommodates the possibility of jets not
passing the selection cuts. For analyses like m_t measurements in tt̄ events as described in this paper, the number
of jets required in the event selection ensures that in any event where this is the case, an additional hard parton
would have to be produced which yields a jet passing the cuts. In principle, such events have to be described by
the signal process. At the Tevatron, their contribution to the event sample is so small that they do not have to be
accounted for explicitly in the method until the final calibration step. Consequently, the selection-based transfer
function normalization described in Sections II D 1 and II D 2 is chosen for the studies described in this paper, as it
eliminates the dependency of σ^obs' on the parameters β and thus facilitates the simultaneous measurement of several
parameters. This picture may change for measurements in tt̄ events at the LHC, where initial-state radiation becomes
much more relevant.

5. Normalization of the Background Likelihood 

The normalization of the likelihoods can in principle be determined in the same way for all processes considered.
Alternatively, if the normalization of the likelihood L_{P_0} for one specific process P_0 (for example, the signal process)
has been determined as described above and if the fraction f_{P_0} of events from that process in the selected sample is
left as a free parameter in the analysis, it is possible to relate the absolute normalization of the likelihoods for all
other processes to that of process P_0. One can then make use of the fact that the fit described in Section II E will
yield a signal fraction f_{P_0} of the sample that is too small if the background likelihood is too large and vice versa, and
one can adjust the relative normalization in the validation of the Matrix Element method until the signal fraction is
determined correctly. This concept can only be applied if the cross section for the process P_0 is well known (as, for
example, for tt̄ production). It is then helpful in particular if the likelihoods for background processes as implemented
in the analysis do not depend on any of the parameters α and β: in such a case, only one normalization constant
needs to be determined for each background process.
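
The interplay between the background normalization and the fitted signal fraction can be made concrete with a toy sketch. The event list, the likelihood values, and the function name below are hypothetical: each event carries a pair (L_sig, L_bkg), the background likelihood is multiplied by a relative normalization constant, and the signal fraction is found by a simple grid scan. This is an illustration of the argument, not the fit used in the analysis.

```python
import math


def fit_signal_fraction(events, background_scale, steps=100):
    """Grid scan for the f that maximizes sum ln(f*L_sig + (1-f)*k*L_bkg).

    events: list of (L_sig, L_bkg) pairs; background_scale: the relative
    normalization constant k applied to the background likelihood.
    """
    best_f, best_log_l = None, -math.inf
    for i in range(steps + 1):
        f = i / steps
        log_l = sum(
            math.log(f * l_sig + (1.0 - f) * background_scale * l_bkg)
            for l_sig, l_bkg in events
        )
        if log_l > best_log_l:
            best_f, best_log_l = f, log_l
    return best_f


# Toy sample (hypothetical numbers): two event categories with correctly
# normalized likelihoods, mixed according to a true signal fraction of 0.6.
TOY_EVENTS = [(0.8, 0.2)] * 56 + [(0.2, 0.8)] * 44

if __name__ == "__main__":
    for k in (0.5, 1.0, 2.0):
        print(k, fit_signal_fraction(TOY_EVENTS, k))
```

In this toy sample the correct normalization (k = 1) returns the true fraction of 0.6, a background likelihood that is too large (k = 2) pulls the fitted fraction down, and one that is too small (k = 0.5) pushes it up to the boundary. Scanning k until the fitted fraction matches the known input is the adjustment described in the text.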

E. Fitting Procedure 

For a given sample of selected events, the parameters to be measured are determined as those values that maximize
the likelihood L_sample. One wants to determine n_α physics parameters, n_β parameters describing the detector
resolution, and n_{f_P} fractions of events from different processes P. For every measured event, the likelihoods for each
process are calculated for an (n_α + n_β)-dimensional grid of assumed parameter values. Given these grids of likelihood
values for each process, the sample likelihood L_sample(x_1, ..., x_N; α, β, f) defined above is available for an
(n_α + n_β + n_{f_P})-dimensional grid of assumed parameter values.

The measurement value of a given parameter a and the corresponding uncertainty are then determined from a
one-dimensional likelihood L_sample(a). The value of L_sample(a) is obtained by marginalization of all other parameters;
in practice, this is done by keeping the value of a constant, varying the assumed values of all (n_α + n_β + n_{f_P} − 1)
other parameters, and taking the maximum L_sample value. The one-dimensional function −ln L_sample(a) is fitted
with a parabola. The parameter value that minimizes the parabola is taken to be the measurement value, and the
measurement uncertainty is given by the parameter values where the fitted parabola rises by +1/2 above the minimum.
By construction, this procedure takes correlations between the parameters into account.
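
A minimal sketch of this procedure for a two-parameter grid is given below. It makes two simplifying assumptions that are not taken from the analysis: the grid of −ln L_sample values is stored as a nested list indexed by the parameter of interest first, and the parabola is determined from the three grid points around the minimum instead of a fit to a wider range. All function names are illustrative.

```python
import math


def profile_over_others(neg_log_l_grid):
    """For each value of the parameter of interest, keep the largest likelihood
    (the smallest -ln L) found among all values of the other parameter."""
    return [min(row) for row in neg_log_l_grid]


def fit_parabola_at_minimum(a_values, profiled):
    """Parabola through the three grid points around the minimum of -ln L.

    Returns the position of the parabola minimum and the half-width at which
    the parabola has risen by 1/2 above its minimum.
    """
    i = min(range(len(profiled)), key=profiled.__getitem__)
    if i == 0 or i == len(profiled) - 1:
        raise ValueError("minimum lies on the edge of the scanned range")
    x0, x1, x2 = a_values[i - 1:i + 2]
    y0, y1, y2 = profiled[i - 1:i + 2]
    d1 = (y1 - y0) / (x1 - x0)           # first divided differences
    d2 = (y2 - y1) / (x2 - x1)
    curvature = (d2 - d1) / (x2 - x0)    # coefficient of a^2
    slope = d1 - curvature * (x0 + x1)   # coefficient of a
    best = -slope / (2.0 * curvature)
    uncertainty = math.sqrt(0.5 / curvature)  # curvature * delta^2 = 1/2
    return best, uncertainty


def toy_neg_log_l(a, b, a0=172.3, sigma_a=1.5, b0=1.0, sigma_b=0.02, rho=0.5):
    """Hypothetical correlated Gaussian likelihood used only for the demo."""
    da, db = (a - a0) / sigma_a, (b - b0) / sigma_b
    return 0.5 * (da * da - 2.0 * rho * da * db + db * db) / (1.0 - rho * rho)


if __name__ == "__main__":
    a_values = [168.0 + i for i in range(9)]
    b_values = [0.9 + 0.0005 * j for j in range(401)]
    grid = [[toy_neg_log_l(a, b) for b in b_values] for a in a_values]
    print(fit_parabola_at_minimum(a_values, profile_over_others(grid)))
```

Because the maximum over the second parameter is taken separately for every value of the first, the width of the resulting parabola includes the effect of the correlation between the two.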

F. Validation With Ensemble Tests 

To validate the measurement technique, tests are performed with simulated events generated under the assumptions 
used in the Matrix Element method, i.e., using the same PDF set, matrix element, and transfer function. A
pseudo-experiment emulates a measurement performed on data and consists of events randomly drawn from Monte Carlo
event pools for signal and background processes. The numbers of events taken from the different pools are chosen 
to reflect the fractions observed in the data. An ensemble of several pseudo-experiments is performed for each of 
a number of sets of assumed input parameter values. The range of assumed values is chosen according to previous 
determinations and the expected precision of the measurement. 

Taking the results from all ensembles, the following information is obtained: 

• The relation between the expected (mean) measurement values and the corresponding true input values. It is
expected that the method yields unbiased results if the Matrix Element method reflects the properties of the 
events. 

• The distribution of measurement uncertainties as a function of input parameter values. 

• The width w of the pull distribution. To test that the fitted uncertainties describe the actual measurement
uncertainty, the deviation of the measurement value from the true value is divided by the fitted measurement
uncertainty in each pseudo-experiment. The width of this distribution of deviations normalized by the
measurement uncertainties is referred to as the pull width. It is expected that w = 1 if all features of the events are
accommodated in the method.

In a similar way, ensemble tests based on fully simulated events can be used to determine any correction of measurement 
values and fit uncertainties needed when applying the method (which is based on a simplified detector model) to real 
data. 

Because the computation of likelihoods is time-consuming, the size of the pools of simulated events is usually limited 
and individual events are allowed to be redrawn, i.e. to appear several times even in the same pseudo-experiment. 
This technique maximizes the information about the expected uncertainties and pull widths, but it has to be taken 
into account when evaluating the uncertainties of the ensemble test results [11].

To summarize, the validation tests described in Sections V and VI each comprise the following three steps:

1. Likelihood Fit: build one pseudo-experiment and determine α, β, and f (cf. Section II E);

2. Ensemble Test: repeat Step 1 with 1000 pseudo-experiments and obtain mean results, expected uncertainties, 
and pull widths; and 

3. Validation: repeat Step 2 for several input parameter values to obtain calibration curves. 
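
The three steps can be sketched with a toy stand-in for the likelihood fit. In the sketch below, which is an illustration under stated assumptions and not the analysis code, the "measurement" is simply the mean of Gaussian-distributed event values with its standard error as the fitted uncertainty; pool sizes, the resolution, and all function names are hypothetical. Events are drawn from a finite pool with replacement, as described above.

```python
import math
import random
import statistics


def toy_measurement(events):
    """Step 1 stand-in for the likelihood fit: estimate and fitted uncertainty."""
    value = statistics.fmean(events)
    uncertainty = statistics.stdev(events) / math.sqrt(len(events))
    return value, uncertainty


def ensemble_test(pool, true_value, n_experiments=1000, n_events=100, seed=1):
    """Step 2: repeat the measurement on many pseudo-experiments."""
    rng = random.Random(seed)
    values, uncertainties, pulls = [], [], []
    for _ in range(n_experiments):
        sample = rng.choices(pool, k=n_events)  # events may be redrawn
        value, uncertainty = toy_measurement(sample)
        values.append(value)
        uncertainties.append(uncertainty)
        pulls.append((value - true_value) / uncertainty)
    return {
        "mean_value": statistics.fmean(values),
        "mean_uncertainty": statistics.fmean(uncertainties),
        "pull_width": statistics.stdev(pulls),
    }


def calibration_curve(true_values, pool_size=5000, resolution=10.0, seed=2):
    """Step 3: repeat the ensemble test for several input parameter values."""
    rng = random.Random(seed)
    curve = []
    for true_value in true_values:
        pool = [rng.gauss(true_value, resolution) for _ in range(pool_size)]
        curve.append((true_value, ensemble_test(pool, true_value)))
    return curve


if __name__ == "__main__":
    for true_value, result in calibration_curve([165.0, 170.0, 175.0]):
        print(true_value, result)
```

Since all pseudo-experiments for one input value share the same finite pool, their results are correlated; the sketch does not correct the uncertainties of the ensemble results for this effect.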

III. SIMULATION AND SELECTION OF tt̄ EVENTS

As an example of a concrete implementation of the Matrix Element method described previously, the measurement
of the top quark mass in lepton+jets and dilepton tt̄ events is described in this and the following sections. The
discussion of dilepton events is restricted to events containing one electronic and one muonic W decay, which yield
the most precise top quark mass measurement in dilepton events. This section summarizes the generation of smeared
events to study the top quark mass measurement and introduces the event selection criteria.

The characteristics of lepton+jets tt̄ events at a hadron collider are the presence of one energetic isolated charged
lepton, at least four energetic jets (two of which are b-quark jets), and missing transverse momentum due to the
unreconstructed neutrino. The main background is from events where a leptonically decaying W is produced in
association with four or more jets. Multijet background where one jet mimics an isolated charged lepton can also
enter the event sample.

Dilepton tt̄ events are characterized by two oppositely charged energetic isolated leptons (in the case considered
here, one electron and one muon), two energetic b-quark jets, and missing transverse momentum due to the two
neutrinos from the W decays. The largest physics background in the eμ channel is from events with Z → τ⁺τ⁻
decays where the Z boson is produced in association with two or more jets; another background channel is the
production of two leptonically decaying W bosons together with two jets. Instrumental background arises from events
where a leptonically decaying W is produced in association with three or more jets, one of which is misidentified as
the second isolated charged lepton.

In principle, events with leptonically decaying τ leptons from W decay contribute to both the lepton+jets and
dilepton event samples. However, because of lower transverse momentum or transverse energy cuts on the charged
lepton(s) in the event selection (see below), these contributions are typically small. Thus, tt̄ events including leptonic
τ decays are not simulated for the study presented here (whereas in a real measurement, the effect of such decays has
to be accounted for).

For the study presented here, events containing a qq̄ → tt̄ reaction in a pp̄ collision at 1.96 TeV center-of-mass
energy are simulated with the MADGRAPH [l^ generator. Events are generated for each of the different top quark
masses, varied between 160 GeV and 180 GeV in steps of 5 GeV. The ALPGEN ^1^ program is used to generate events
containing a leptonic W or Z decay in association with additional light partons; events with b quarks are simulated
by smearing the light partons with the transfer function for b jets. To simulate the decay of a τ lepton to an electron
or muon in Z/γ*(→ τ⁺τ⁻)jj events, the τ transfer function shown in Figure 2 is applied, while the direction of the
lepton is left unchanged. For the modeling of the parton distribution functions, the leading-order PDF set CTEQ5L [13]
is chosen. Multijet background without leptonic W or Z decay is not simulated because it was shown in [8] that
its effect on the measurement in the lepton+jets channel is similar to that of additional W+jets background^1. All
simulated events are passed through the parametrized detector simulation discussed in Section II C, which describes
the response of the D0 detector. No simulation of the parton shower and hadronization is performed since the
transfer functions account for their effects in addition to the detector resolution. Samples with different true values of
S_l are obtained by scaling the smeared light-quark jet energies; values of S_l between 0.9 and 1.1 in steps of 0.05 are
used. Similarly, b-quark jets are scaled by S_b, with S_b varied between 0.8 and 1.2 in steps of 0.1, to obtain samples
for different true b-jet energy scales. As the S_b constraint is weaker than the S_l one, a wider range of generated values
was studied for this observable. The association of final-state partons to jets is not assumed to be known in the
subsequent analysis. The reconstructed missing transverse momentum p̸_T is taken to be the negative vector sum of
all other reconstructed transverse momenta (i.e., after the smearing and scaling described above); this means that
before smearing and scaling the tt̄ system has zero p_T.
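
The scaling of jets and the definition of the missing transverse momentum can be sketched as follows. The sketch works with transverse momentum components only, omits the smearing step, and assumes that scaling a jet energy scales its momentum components by the same factor; the event values and function names are hypothetical.

```python
import math


def scale_jets(jets, scale):
    """Apply a jet energy scale to the (px, py) components of every jet."""
    return [(scale * px, scale * py) for px, py in jets]


def missing_pt(reconstructed):
    """Negative vector sum of all reconstructed transverse momenta."""
    sum_px = sum(px for px, _ in reconstructed)
    sum_py = sum(py for _, py in reconstructed)
    return (-sum_px, -sum_py)


# Hypothetical lepton+jets event in which the tt̄ system has zero p_T:
# the lepton, the neutrino, and the four jets balance exactly.
LEPTON = (40.0, 0.0)
NEUTRINO = (-10.0, 30.0)           # not reconstructed
LIGHT_JETS = [(20.0, -25.0), (-35.0, 10.0)]
B_JETS = [(-45.0, -40.0), (30.0, 25.0)]


def reconstructed_missing_pt(s_l=1.0, s_b=1.0):
    """Missing p_T after scaling light-quark jets by S_l and b-quark jets by S_b."""
    objects = [LEPTON] + scale_jets(LIGHT_JETS, s_l) + scale_jets(B_JETS, s_b)
    return missing_pt(objects)


if __name__ == "__main__":
    print(reconstructed_missing_pt())          # equals the neutrino p_T
    print(reconstructed_missing_pt(s_l=1.1))   # shifted by the rescaled jets
    print(math.hypot(*reconstructed_missing_pt(s_l=1.1)))
```

With unit scales and no smearing the reconstructed missing transverse momentum coincides with the neutrino transverse momentum; any change of S_l or S_b propagates into it, which is why a cut on this quantity depends on all measured final-state particles.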

Typical event selection criteria as used by the Tevatron experiments are then applied to the smeared events.
Candidate lepton+jets events are required to contain

• one charged lepton within a pseudorapidity range of |η| < 1.1 (electrons) or |η| < 2.0 (muons) and with a
transverse energy or momentum of at least 20 GeV,

• four jets within |η| < 2.5 and with (scaled) transverse energies of E_T > 20 GeV, and

• missing transverse momentum with magnitude p̸_T > 20 GeV.

The angular separation between the charged lepton and any jet is required to be ΔR = √((Δη)² + (Δφ)²) > 0.5, and
similarly, any jet-jet pair has to be separated by ΔR > 1.0. No b-tagging requirements for the jets are included, but
b-tagging information is used later in the analysis.
Similarly, dilepton events must contain 

• one electron and one muon of opposite charges within pseudorapidity ranges of |η| < 1.1 or 1.5 < |η| < 2.5
(electrons)^2 or |η| < 2.0 (muons) and with a transverse energy or momentum of at least 15 GeV,

• two jets within |η| < 2.5 and with (scaled) transverse energies of E_T > 20 GeV, and

• missing transverse momentum with magnitude p̸_T > 30 GeV.

The same ΔR separation cuts as above are applied, and in addition the two charged leptons are required to be
separated by ΔR > 0.5.
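
The lepton+jets criteria listed above translate directly into a selection function. The sketch below is illustrative: the dictionary-based event layout and the function names are assumptions, and the requirement of "four jets" is interpreted as exactly four jets, all of which must pass the cuts.

```python
import math
from itertools import combinations


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation, with the azimuthal difference wrapped to [-pi, pi)."""
    d_phi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, d_phi)


def passes_lepton_jets_selection(lepton, jets, missing_pt):
    """lepton: dict with flavor ('e' or 'mu'), pt, eta, phi (GeV, radians);
    jets: list of dicts with et, eta, phi; missing_pt: magnitude in GeV."""
    max_eta = {"e": 1.1, "mu": 2.0}[lepton["flavor"]]
    if abs(lepton["eta"]) >= max_eta or lepton["pt"] < 20.0:
        return False
    if len(jets) != 4:  # assumption: exactly four jets are required
        return False
    if any(abs(j["eta"]) >= 2.5 or j["et"] <= 20.0 for j in jets):
        return False
    if missing_pt <= 20.0:
        return False
    if any(delta_r(lepton["eta"], lepton["phi"], j["eta"], j["phi"]) <= 0.5
           for j in jets):
        return False
    if any(delta_r(a["eta"], a["phi"], b["eta"], b["phi"]) <= 1.0
           for a, b in combinations(jets, 2)):
        return False
    return True


# Hypothetical event used for illustration only.
EXAMPLE_LEPTON = {"flavor": "e", "pt": 35.0, "eta": 0.3, "phi": 0.0}
EXAMPLE_JETS = [
    {"et": 40.0, "eta": 1.0, "phi": 1.5},
    {"et": 55.0, "eta": -0.8, "phi": 3.0},
    {"et": 30.0, "eta": 0.2, "phi": -1.5},
    {"et": 25.0, "eta": -2.0, "phi": -2.8},
]

if __name__ == "__main__":
    print(passes_lepton_jets_selection(EXAMPLE_LEPTON, EXAMPLE_JETS, 28.0))
```

The dilepton selection follows the same pattern with two leptons, two jets, the additional lepton-lepton separation, and the thresholds quoted in the text.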

The event samples described here are used for validating the measurement technique as discussed in Sections V
and VI. While the exact event selection criteria are not critical to the method, it is mandatory to adjust the likelihood
calculation accordingly.

IV. LIKELIHOOD IMPLEMENTATION FOR MEASUREMENTS IN tt̄ EVENTS

This section describes the calculation of the signal and background likelihoods for lepton+jets and dilepton tt̄ events.
When the likelihood for a certain process has to be evaluated for many hypotheses, a dedicated implementation of the
matrix element optimized for speed is beneficial, and it is helpful to limit the number of evaluations of the transfer
function. Section IV A discusses the evaluation of the signal tt̄ likelihoods for a top quark mass measurement (in
the lepton+jets or dilepton channel) as an example of such a case. In contrast, when the number of hypotheses is
smaller and/or there are many individual diagrams contributing to a process, interfacing to routines from existing
Monte Carlo generators is a powerful solution. Such a case is the evaluation of the background likelihoods for an m_t
measurement, which is described in Section IV B. An overview of the event likelihood calculation for the different
decay channels and processes in a top quark mass measurement is given in Table I.



channel             processes              likelihoods     parameters

lepton+jets         qq̄ → tt̄               L_tt̄           m_t, S_b, S_l
                    Wjjjj                  L_Wjjjj         (none)

dilepton            qq̄ → tt̄               L_tt̄           m_t, S_b
(eμ channel)        Z/γ*(→ τ⁺τ⁻)jj         L_Zjj           (none)

^1 In a real measurement, it is thus possible to model both W+jets and multijet background by the W+jets process to calculate the
likelihood L_sample, and to account for any differences between W+jets and multijet background when calibrating the measurement
using full simulation.

^2 This cut rejects electrons in the transition region between the barrel and endcap parts of the electromagnetic calorimeter, which has
poor electron identification performance and is typically located at around 1.1 < |η| < 1.5.

TABLE I: Overview of the L_evt calculation in the lepton+jets and dilepton channels. The column entitled "processes" lists the
signal and background processes taken into account in the calculation of the event likelihood L_evt. The symbol "j" refers to
any light parton, i.e. a u, d, s, or c quark (or antiquark) or a gluon. The rightmost column shows the parameters on which the
likelihoods L_P for each individual process depend. In principle the background likelihoods depend on S_l, but as shown later in
the paper it is possible to omit this dependence without introducing a significant bias on the m_t measurement.



A. The Signal Likelihood 

For the calculation of the signal likelihood, the procedure described in M , [15] has been extended and optimized. It
is based on the leading-order matrix element for the process qq̄ → tt̄ [16]. Aspects that are unchanged from d, [11]
are only briefly mentioned in the following. The matrix element for the process gg → tt̄ is not evaluated explicitly
because the top and W propagator and decay parts of the matrix element, which contain most of the information on
the top quark mass and the separation of signal and background events, are identical.

The correct association of reconstructed jets with the final-state quarks is not known. Therefore, the transfer
function takes into account all possible jet-parton assignments as described in Section II C. For a given measured
event x, the convolution integral in Equation (5) is calculated separately for each jet-parton assignment and for
all different m_t assumptions, while all different S_b and S_l hypotheses are considered simultaneously. The integral
evaluation is performed numerically with the Monte Carlo program VEGAS [13, [3 ^ which has been slightly extended
to achieve the simultaneous evaluation of several integrals with the same distribution of parton configurations y. A
single call to the routine calculating the integrand returns an array of values for all assumed S_b and S_l values under
consideration. This reduces the total computation time spent for a given number of calls to evaluate the integrand,
because the matrix element does not have to be re-evaluated when only the S_b or S_l assumptions change. Even more
importantly, fluctuations between the results obtained for different S_b and S_l assumptions are reduced because the
integrand is evaluated for the same parton configurations y.
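
The idea of evaluating several integrals on a common set of parton configurations can be sketched with plain (non-adaptive) Monte Carlo integration in one dimension. This is not the VEGAS algorithm and not the extension described above; the Breit-Wigner-like weight, the Gaussian transfer function, and all names and numerical values are toy assumptions chosen only to show the structure of the loop.

```python
import math
import random


def matrix_element_weight(y, pole=50.0, width=2.0):
    """Stand-in for the expensive, scale-independent part of the integrand."""
    return 1.0 / ((y - pole) ** 2 + width ** 2)


def transfer_function(x, y, scale, sigma=5.0):
    """Stand-in for the cheap, scale-dependent part: Gaussian in x/scale - y."""
    z = (x / scale - y) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma * scale)


def integrate_all_scales(x, scales, n_points=20000, y_range=(0.0, 100.0),
                         seed=7, counter=None):
    """Estimate the convolution integral for every scale hypothesis at once.

    The same sampled points y are used for all hypotheses, and the
    scale-independent weight is computed only once per point.
    counter, if given, is a one-element list counting those evaluations.
    """
    rng = random.Random(seed)
    lo, hi = y_range
    sums = [0.0] * len(scales)
    for _ in range(n_points):
        y = rng.uniform(lo, hi)
        weight = matrix_element_weight(y)  # evaluated once per point
        if counter is not None:
            counter[0] += 1
        for k, scale in enumerate(scales):
            sums[k] += weight * transfer_function(x, y, scale)
    volume = hi - lo
    return [volume * s / n_points for s in sums]


if __name__ == "__main__":
    calls = [0]
    print(integrate_all_scales(50.0, [0.9, 1.0, 1.1], counter=calls), calls[0])
```

Because every hypothesis sees the same points, the statistical fluctuations of the individual estimates are strongly correlated and largely cancel in comparisons between neighbouring hypotheses, which is the property the likelihood scan relies on.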

While the expected measurement uncertainty on S_b and S_l is small relative to the resolution of jet energy
measurements, the current uncertainty on the world average m_t value is of the same order as the top quark width. If
the range of m_t hypotheses to be tested in a measurement spans several times the top quark width, then
the distribution of parton configurations y at which the integrand is evaluated for one m_t value is inappropriate for
other values, and the technique becomes inefficient. Thus, in the study presented here, the likelihoods for different m_t
assumptions are evaluated independently.

To evaluate the signal likelihood for an event x and all assumed values of the quantities m_t, S_b, and S_l to be
measured, the following computations are performed:

• Loop over all top quark mass assumptions,

• loop over all jet-parton assignments, and

• use the program VEGAS to compute the convolution integral in Equation (5) for all S_b and S_l hypotheses.

The integration in Equation (5) is over the kinematic variables of the assumed parton configuration, as described
in Section II B. The number of dimensions is reduced by assuming perfect measurement of some of the quantities.
Via variable transformation the remaining integration variables have been chosen such that where possible, they
are uncorrelated, the integrand exhibits sharp peaks as a function of each individual variable (this optimizes the
performance of the VEGAS program), and the variable transformation involves at most a quadratic equation (so the
transformation is fast and numerically stable). The integration variables chosen in the lepton+jets and dilepton
channels are summarized in Table II. The first two rows list variables corresponding to invariant masses and to jet
momenta, respectively. Other variables are listed in the third row, and the final row indicates the integration necessary
because of the finite muon momentum resolution.

The steps to evaluate the integrand for given values of the integration variables are: 

• Determine the momenta of all final-state particles from the values of the integration variables. 

• Evaluate the Jacobian determinant det (J) for the variable transformation. 

• Calculate the value |M_P|² f_PDF(ξ_1) f_PDF(ξ_2) of the matrix element squared times PDF factors, summing over all
possible initial-state parton species.

• Then loop over all final-state particles, 

• for each particle, loop over all relevant Sb or Si hypotheses, if applicable, and 

• evaluate the transfer function factor corresponding to that particle. 

                      lepton+jets channel                            dilepton channel

invariant masses      two top quark masses,                          two top quark masses
                      hadronically decaying W mass

jet momenta           |p_u|                                          |p_b1|, |p_b2|

other variables       sum of the longitudinal momenta of the         x and y components of the difference
                      b quark (leptonic top decay) and the neutrino  of the two neutrino momenta

muon resolution       (q/p_T)_μ                                      (q/p_T)_μ

TABLE II: Overview of the integration variables for the signal likelihood calculation in the lepton+jets and dilepton channels.
In the lepton+jets channel, the integration is over the masses of the two top quarks and the hadronically decaying W boson,
the momentum of the up-type quark from the hadronically decaying W, and the sum of the longitudinal momenta of the
b quark from the top quark with the leptonic W decay and the neutrino. In the dilepton channel, the top quark masses,
b-quark momenta, and the x and y components of the vectorial difference of the two neutrino momenta are taken as integration
variables. The ratio of muon charge and transverse momentum is a further integration variable where applicable.



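• Return the product

$$
\begin{aligned}
& d\sigma_P(p\bar{p} \to y)\; W(x, y; S_b, S_l)\; \det(J) \\
&\quad = \sum_{a_1, a_2} d\xi_1\, d\xi_2\; f_{\rm PDF}^{a_1}(\xi_1)\, f_{\rm PDF}^{a_2}(\xi_2)\;
   \frac{(2\pi)^4 \left| \mathcal{M}_P(a_1 a_2 \to y) \right|^2}{\xi_1 \xi_2 s}\; d\Phi \\
&\qquad \times\; W(x, y; S_b, S_l)\; \det(J) \qquad (20)
\end{aligned}
$$

for all S_b and S_l hypotheses.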

The convolution integral in Equation (5) has to be calculated for every selected event and is thus the most
computing-intensive part of the analysis. The optimization introduced here allows the integration necessary for the
determination of three parameters in the lepton+jets channel to be performed within roughly the time needed in the
previous implementation for just two parameters.

The normalization σ^obs' only has to be determined once for a number of hypotheses relevant to the analysis. This
is done in a separate program based on the VEGAS package that performs a 16-dimensional Monte Carlo integration
over the observable final-state phase space. The phase space is generated recursively from the production of the tt̄
pair and the subsequent two-body top and W decays.



B. The Background Likelihood 



There are in general many background processes that can lead to an observed event. It is not problematic per se 
to not fully account for all backgrounds in the event likelihood. An incomplete background likelihood will lead to a 
shift of the measured top quark mass value (apart from an increased statistical uncertainty); the shift will in general
depend on the top quark mass itself and on the fraction of events in the sample that are not accounted for in the 
overall likelihood. The shift is determined in the calibration procedure. When a background term is omitted in the 
event likelihood, the situation will thus be quantitatively, but not qualitatively different from that in an analysis that 
includes this term in the likelihood. 

If several different background processes have similar kinematic characteristics, it is also possible to approximately
describe the total background by the likelihood for only one of the background processes, multiplied by the total
background fraction. This technique has been applied by both CDF and D0 in the Matrix Element
analyses in the lepton+jets channel, where a likelihood for QCD multijet production is not explicitly calculated.

Only leading-order background processes to tt̄ events and only the most important ones among them are considered
explicitly in this paper. To take into account all individual diagrams, routines from existing Monte Carlo generators 
are used to compute the likelihood for generic processes. They take into account the relative importance of the various 
subprocesses that contribute and perform a statistical sampling of all possible spin, flavor, and color configurations. 
Because the background likelihood does not depend on the top quark mass, it does not have to be computed for 
as many different assumptions as the signal likelihood and it is possible to evaluate the matrix element without a 
dedicated routine optimized for speed. 

The generic background process taken into account in the lepton+jets channel is the production of a leptonically 
decaying W boson in association with four additional light partons, Wjjjj. Events with a leptonically decaying 
W boson and four partons that include heavy-flavor quarks are not considered separately because their kinematic 
characteristics are very similar to those of Wjjjj events. 

The VECBOS [1^ generator is used to calculate the background likelihood L_Wjjjj. The jet directions and the charged
lepton are taken as well-measured. The integral in Equation (5) is performed by generating Monte Carlo events with
quark energies distributed according to the jet transfer function. In these Monte Carlo events, the neutrino transverse
momentum is given by the condition that the transverse momentum of the W+jets system be zero, while the invariant
mass of the charged lepton and neutrino is assumed to be equal to the W mass to obtain the neutrino z momentum
(both solutions are considered). All 24 possible assignments of jets to quarks in the matrix element are considered
and their contributions to the likelihood summed. Monte Carlo events are drawn according to the appropriate jet
resolution functions for the four reconstructed jets, the likelihood L_Wjjjj is computed for each of these events, and
their mean value is used in the subsequent analysis. The study described in Section V supports the conclusion that
it is not necessary to compute the likelihood L_Wjjjj for different S_l values; only the value L_Wjjjj(S_l = 1) is used.

The measurement in the dilepton (eμ) channel considers background from events containing a Z boson decaying via
τ leptons to an electron and a muon (plus neutrinos) and two additional light partons explicitly in the event likelihood.
The likelihood is calculated using the VECBOS generator as above, including the transfer function for leptonic τ decays
described in Section II C. Again, the jet directions and the charged lepton are taken as well-measured, and the integral
in Equation (5) is performed by generating Monte Carlo events with quark energies distributed according to the jet
transfer function. The energies of the two τ leptons are then given by the condition that the transverse momentum of
the Z+jets system be zero. Both possible assignments of jets to quarks are considered, and as above, only the value
L_Zjj(S_l = 1) is used.

V. APPLICATION OF THE TECHNIQUE TO tt̄ EVENTS IN THE LEPTON+JETS CHANNEL

The method is validated using smeared parton-level simulated tt̄ and Wjjjj events, generated as described in
Section III. In this study, pool sizes of 1500 events for the lepton+jets tt̄ signal process and 850 Wjjjj background
events are available. In order to model processes not covered by the method, two additional samples are generated.
Out of the Wjjjj background sample, 450 events are modified into Wbbjj events by randomly assigning two light
partons as b partons and smearing them according to the b quark transfer functions. With this sample, effects of
heavy flavor content in the background can be studied. The other test sample is composed of 800 lepton+jets tt̄
events that contain an additional parton from initial- or final-state radiation.

As the e+jets and μ+jets decay channels only differ in the momentum resolution of the lepton and whether a
transfer function is used to parametrize it, no qualitative difference between measurements in the two channels is
expected. This was verified in [20]. Thus, in the following only the e+jets decay channel will be considered. The
different angular acceptance cuts for electrons and muons lead to different signal fractions for the two channels, but
the conclusions from the studies described here are still valid since they have been performed for a wide range of
signal fractions.

In Section V A the method is tested on ensembles containing signal events only. Section V B describes studies
performed on samples including background events.

A. Signal-Only Studies 

The method is first tested with ensembles only containing signal events. For these studies, 1000 pseudo-experiments 
are composed of 100 e+jets events each, and background likelihoods are not included. The reconstructed fit observables
should resemble the generated input values within statistical uncertainties and the pull widths are expected to be 
equal to unity (within uncertainties). 

The likelihood normalization σ^obs' as a function of the top quark mass, as defined in Section II D, is given in Figure 5
for the e+jets and μ+jets channels (only the e+jets function is used further). These functions are fitted with 3rd-order
polynomials as a function of the top quark mass. The two channels yield different normalization functions as the
detector acceptance differs for the two lepton types.

In Figure 6, results for the measurements of the three fit observables m_t, S_b, and S_l can be found. In this and the
similar figures that follow, the error bars represent the uncertainties arising from limited statistics in the ensemble
tests. The reconstructed values reproduce the generated ones well, and the deviations between fitted and true values
are adequately described by the fitted statistical uncertainties. The pull widths in Figure 6(d) are on average two
standard deviations below the expectation. A similar conclusion cannot be drawn from Figures 6(e) and (f) since the
same events are used for all S_b and S_l values except for a rescaling of jet energies. The results show that the method
works in this test case and that the b-jet energy scale can be measured together with the top quark mass and light-jet
energy scale.

In order to quantify the gain from the inclusion of b identification likelihoods as the factor W_b in Equation (13),
the expected statistical uncertainties on the measurement quantities m_t, S_b, and S_l are depicted in Figure 7. These
statistical uncertainties correspond to the hypothetical case of an integrated luminosity at the Tevatron of about

FIG. 5: Lepton+jets channel: Normalization function σ^obs' for the tt̄ likelihood in the lepton+jets channel as a function of the
top quark mass for the e+jets (blue) and μ+jets channels (red circles). The normalization functions are plotted relative to the
fitted values for m_t = 170 GeV. An overall scale factor is irrelevant for the subsequent analysis because it is absorbed by the
normalization procedure for the background likelihood.



0.8 fb^^ (for one experiment and one decay channel) without any background. The solid lines show the expected 
uncertainties obtained when using the full transfer function, while the uncertainties given by the dashed lines are 
obtained when the factor Wb is omitted from the transfer function. For all three quantities an improvement of 
about 15% on the expected relative statistical uncertainty can be achieved because of the additional b identification 
information. No systematic deviation between the measurement values obtained with and without b identification 
likelihoods is observed. The results without Wb are shown for illustration only; in the final studies with lepton+jets 
events the factor Wb is included. 
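
To put the 15% improvement into perspective, it can be translated into an equivalent increase in sample size. The estimate below assumes that the statistical uncertainty scales with the inverse square root of the number of events N; this scaling is a standard assumption introduced here for illustration and is not stated in the text above.

```latex
\frac{\sigma_{\text{with } W_b}}{\sigma_{\text{without } W_b}} \approx 0.85
\quad\Longrightarrow\quad
\frac{N_{\text{equiv}}}{N} \approx \frac{1}{0.85^{2}} \approx 1.38
```

Under this assumption, including the b identification information is roughly equivalent to analyzing about 38% more events.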

B. Studies Including Wjjjj and Wbbjj Background 

For the background studies, ensembles are composed of 1000 pseudo-experiments with 200 events each, using 
different signal fractions. The case of a signal fraction f_tt̄ = 50% corresponds to an integrated luminosity at one 
Tevatron experiment of about 0.8 fb^-1 (for one decay channel). Two sources of background are studied: Wjjjj and 
Wbbjj events. 

Background from Wjjjj events, containing a leptonically decaying W and four light partons, is described by the 
background likelihood (see Section IV B) and thus accounted for. To study the dependence of the method on the 
background fraction, it is varied between 0% and 90% in 10% steps. Figure 8 shows the results for ensembles with 
true values of m_t = 170 GeV and S_b = S_l = 1. The top quark mass fit yields the expected results even for background 
fractions significantly larger than those observed in the data. The two jet energy scales show deviations from the 
expected values when background is included; these deviations increase with the background fraction. This effect 
is not unexpected since the background likelihood is calculated only for the S_b = S_l = 1 hypothesis. However, 
this simplification is appropriate when the goal of the analysis is a measurement of the top quark mass, while the 
determination of the jet energy scales is only performed to reduce the systematic uncertainties on this measurement. 
The lack of an exact modelling of the jet energy scale in background events does not limit the precision of the top 
quark mass determination. 
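
The composition of such ensembles can be sketched as follows. The example assumes that the numbers of signal and background events per pseudo-experiment are fixed rather than fluctuated and that events are drawn with replacement from common pools; both choices, as well as the function name `draw_pseudo_experiment` and the placeholder event pools, are assumptions for illustration.

```python
import random

def draw_pseudo_experiment(signal_pool, background_pool, n_events,
                           background_percent, rng):
    """Draw one pseudo-experiment with a fixed background fraction.

    Assumption: event counts are fixed (not fluctuated), and events are
    drawn with replacement from the signal and background pools.
    """
    n_background = n_events * background_percent // 100
    n_signal = n_events - n_background
    events = rng.choices(signal_pool, k=n_signal)
    events += rng.choices(background_pool, k=n_background)
    return events

# Hypothetical event pools: each "event" is only a label here; in a real
# analysis it would carry the per-event signal and background likelihoods.
signal_pool = [("ttbar", i) for i in range(5000)]
background_pool = [("Wjjjj", i) for i in range(5000)]

rng = random.Random(2024)
ensembles = {}
for background_percent in range(0, 100, 10):       # 0%, 10%, ..., 90%
    ensembles[background_percent] = [
        draw_pseudo_experiment(signal_pool, background_pool, 200,
                               background_percent, rng)
        for _ in range(1000)                        # 1000 pseudo-experiments
    ]
```

Because all ensembles are drawn from the same pools, results obtained at different background fractions are statistically correlated.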

Background from W+jets events containing b quarks is topologically very similar to Wjjjj background and is thus 
not treated as a separate process in the likelihood calculation. Nonetheless, if one includes b identification information 
in the analysis, Wbbjj events have to be considered carefully. 

Figure 9 shows the difference between the log-likelihood values ln L_tt̄ and ln L_Wjjjj for tt̄ signal as well as Wjjjj and 
Wbbjj background events. The topological information alone already allows for a discrimination between signal and 
background. But as expected, there is no separation between background without (Wjjjj) and with (Wbbjj) b jets; 
such a separation only arises when b identification information is included. Figure 9(a) is shown for illustration only; 
in the final studies with lepton+jets events the factor W_b is included in the transfer function as given in Equation (13). 

FIG. 6: Lepton+jets channel: Measurement of m_t, S_b, and S_l in pure signal ensembles. Reconstructed ("rec") vs. true values 
are shown in plots (a)-(c), the widths of the pull distributions vs. true values in plots (d)-(f). In plots (a)-(c) the solid lines 
show the results of straight-line fits, while in plots (d)-(f) they indicate the mean values. Event samples with the same m_t 
but different S_b or S_l values are correlated since they are obtained by scaling the final-state quark energies as described in 
Section III. 

Ensembles with m_t = 170 GeV and S_b = S_l = 1 are created that have a fixed total fraction of background (50%), 
but the fraction of Wbbjj events within this background is varied. This means that an absolute fraction of f_Wbbjj = 0.5 
corresponds to pseudo-experiments in which the background consists solely of Wbbjj events. Figure 10 shows the 
results for the three measurement quantities versus the absolute fraction of Wbbjj events. The results indicate 
that there is only a weak dependence of the fit results on the fraction of Wbbjj events for all three fit observables. 
Consequently, only small systematic uncertainties arise in the calibration of the measurement, and it is justified that 
Wbbjj events are not accounted for explicitly in the event likelihood. Note that the results for f_Wbbjj = 0 correspond 
to the values for f_Wjjjj = 0.5 in Figure 8 and that for S_b and S_l, deviations between fitted and true values are not 
unexpected for f_Wjjjj > 0, as explained above. 

For an integrated luminosity of 12 fb^-1, a signal fraction f_tt̄ = 50%, and absolute background fractions of f_Wjjjj = 
40% and f_Wbbjj = 10%, the expected statistical uncertainties in the lepton+jets channel (combining the e+jets and 
μ+jets channels) obtained by a single Tevatron experiment are found to be 

$$
\begin{aligned}
\sigma_{m_t}(\text{lepton+jets}) &= 0.45~\text{GeV}, \\
\sigma_{S_b}(\text{lepton+jets}) &= 0.0064, \text{ and} \\
\sigma_{S_l}(\text{lepton+jets}) &= 0.0039.
\end{aligned}
\tag{21}
$$

The slopes of the calibration curves are between 0.91 and 0.97 and have been accounted for, and the uncertainties 
have been multiplied by the pull widths, which are between 0.94 and 1.04. 

VI. APPLICATION OF THE TECHNIQUE TO tt̄ EVENTS IN THE DILEPTON CHANNEL 

FIG. 7: Lepton+jets channel: Expected statistical uncertainties of the m_t (circles), S_b (squares), and S_l (triangles) 
determination in pseudo-experiments with signal events only. The filled markers are the results with the full transfer function 
included, whereas the open markers lack the W_b factor (b identification). Solid and dashed lines indicate straight-line fits to 
the points. 

FIG. 8: Lepton+jets channel: Measurement of (a) m_t, (b) S_b, and (c) S_l in ensembles including Wjjjj background. 
Reconstructed values are shown as a function of the background fraction. The individual points in each plot are correlated because 
the ensembles are drawn from the same event pools. The lines indicate the results of 3rd-order polynomial fits to the points. 

This section describes the application of the Matrix Element method for a simultaneous measurement of the top 
quark mass and the b-jet energy scale in dilepton tt̄ events in the eμ channel. As in the lepton+jets channel, to 
minimize computing time, the parton-level studies described here have been performed assuming perfectly measured 
lepton momenta, i.e. the generated leptons are not smeared, and the additional integration over the inverse muon 
transverse momentum is not carried out. This approach is valid here because the aim is to study the behavior of 
the measurement method when applied to pseudo-experiments with different background compositions, and because 
it has been verified that the conclusions from the parton-level tests are not affected by this choice. 
To validate the integration over the inverse muon momentum, an additional test has been performed using smeared 
leptons. 

In the dilepton channel, b-tagging information cannot help to select the correct assignment of jets to partons as in 
the lepton+jets case (except in events with significant gluon radiation). Since the background to tt̄ dilepton events is 
small, no b-tagging information has been used in the studies shown in this section, i.e. the factor W_b has been omitted 
from the transfer function in Equation (13). 

FIG. 9: Lepton+jets channel: Difference between the log-likelihood values ln L_tt̄ and ln L_Wjjjj for lepton+jets tt̄ signal as 
well as Wjjjj and Wbbjj background events. Each individual distribution is normalized. In plot (a), the W_b factor has been 
omitted in the transfer function, while it is included in plot (b). 

FIG. 10: Lepton+jets channel: Measurement of (a) m_t, (b) S_b, and (c) S_l in ensembles including Wjjjj and Wbbjj background. 

As the missing transverse momentum p̸_T depends on the reconstructed jet energies, the normalization of the signal 
likelihood depends not only on m_t, but also on S_b. For a given value of S_b, the normalization is calculated as a 
function of the top quark mass and fitted with a 3rd-order polynomial, similar to what is shown in Figure 5. Each 
of the four parameters of the polynomials as a function of S_b is then fitted in turn with a quadratic function. The 
resulting two-dimensional normalization is shown in Figure 11. The relative normalization of the background and 
signal likelihoods is derived as described in Section III D 5. 
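
The structure of such a two-dimensional parametrization can be sketched as follows: a cubic in m_t whose four coefficients are each a quadratic in S_b. The function name `signal_normalization`, the expansion points (170 GeV and S_b = 1), and all coefficient values are hypothetical and are used for illustration only.

```python
def signal_normalization(m_t, s_b, coefficient_fits, m_ref=170.0, s_ref=1.0):
    """Evaluate a normalization that is cubic in m_t and quadratic in S_b.

    coefficient_fits[k] = (q0, q1, q2) describes the k-th cubic coefficient
    as q0 + q1 * (s_b - s_ref) + q2 * (s_b - s_ref)**2.
    Assumption: expansion around m_ref and s_ref, chosen here for readability.
    """
    x = m_t - m_ref
    d = s_b - s_ref
    total = 0.0
    for power, (q0, q1, q2) in enumerate(coefficient_fits):
        coefficient = q0 + q1 * d + q2 * d * d   # quadratic in S_b
        total += coefficient * x ** power         # cubic in m_t
    return total

# Invented coefficients: four cubic parameters, three quadratic terms each.
toy_fits = [
    (1.0, 0.5, 0.1),
    (0.02, 0.01, 0.0),
    (1.0e-4, 0.0, 0.0),
    (-1.0e-6, 0.0, 0.0),
]

norm_nominal = signal_normalization(170.0, 1.0, toy_fits)
norm_shifted = signal_normalization(180.0, 1.1, toy_fits)
```

Storing the result as twelve numbers (four cubic coefficients times three quadratic terms) allows the normalization to be evaluated at any point in the (m_t, S_b) plane during the likelihood fit.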

Similar to Section V, ensemble tests under different hypotheses are described in the following. 



A. Signal-Only Studies 

In a first step, ensemble tests with pure signal events are performed, and the signal likelihood is taken as the event 
likelihood. For each of nine calibration points in the (m_t, S_b) plane, 1000 pseudo-experiments are performed. Each 
of the pseudo-experiments is built of 50 tt̄ events in the eμ channel, corresponding to an integrated luminosity of 
about 1.4 fb^-1 at the Tevatron [22]. At all calibration points, the generated and the measured values of the top quark 
mass and b-jet energy scale are in excellent agreement. The uncertainty on m_t (S_b) does not depend on S_b (m_t), and 
increases with m_t (S_b) as expected. The pull width is always consistent with unity within uncertainties. 



B. Studies Including Z/γ*(→ τ+τ−)jj and Z/γ*(→ τ+τ−)bb Background 

In the next step, the dominant source of background is added, i.e. Z/γ*(→ τ+τ−)jj events where the Z boson 
decays into an electron and a muon via two τ leptons. Accordingly, the Z/γ*(→ τ+τ−)jj likelihood is included in 
the event likelihood. 

In each of the 1000 pseudo-experiments, 50 events are used, and the fraction of Z/γ*(→ τ+τ−)jj events is varied 
from 10% to 50% in steps of 10%. Within statistical uncertainties, the measured values of m_t and S_b do not depend on 
the fraction of Z/γ*(→ τ+τ−)jj events. Figure 12 shows the calibration curves for m_t and S_b for pseudo-experiments 
containing 30% of Z/γ*(→ τ+τ−)jj events. This fraction corresponds roughly to the total fraction of background 
events selected by the DØ experiment in the eμ channel [22]. Both calibration curves are in excellent agreement with 
the expectation. The pull widths of the m_t and S_b measurements are consistent with unity for all ensembles. 

To study the effect of jets from b quarks, Z/γ*(→ τ+τ−)bb events are also included in the pseudo-experiments. 
These are described by the Z/γ*(→ τ+τ−)jj likelihood; no dedicated likelihood for Z/γ*(→ τ+τ−)bb events is 
included. The sum of the fractions of Z/γ*(→ τ+τ−)jj and Z/γ*(→ τ+τ−)bb events is kept at 30%, and the absolute 
contribution from Z/γ*(→ τ+τ−)bb is varied between 3% and 15% in steps of 3%. Within statistical uncertainties, 
no effect from the jet flavor can be observed on the mean expected measurement values or the widths of the pull 
distributions. 



C. Studies Including Z/γ*(→ τ+τ−)jj and WWjj Events 

Additional contamination of the selected dilepton data sample comes from WWjj events. The expected fraction 
in the eμ channel compared to Z/γ*(→ τ+τ−)jj events is about one fourth [22]. A study has been performed where 
each of the 1000 pseudo-experiments is composed on average of 35 signal, 12 Z/γ*(→ τ+τ−)jj, and 3 WWjj events. 
The WWjj events are not described by a dedicated likelihood because their contribution to the event sample is small. 
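
A composition that holds only on average can be sketched by letting the event type of each of the 50 events fluctuate. The example assumes a multinomial fluctuation with the expected counts as weights; this choice, the function name `draw_composition`, and the process labels are assumptions for illustration, since the text only states the average composition.

```python
import random
from collections import Counter

# Average composition quoted in the text: 35 signal, 12 Z/gamma*, 3 WWjj.
EXPECTED_COUNTS = {"ttbar": 35, "Ztautau_jj": 12, "WWjj": 3}

def draw_composition(n_events, expected_counts, rng):
    """Draw the number of events per process for one pseudo-experiment.

    Assumption: the process of each event is chosen independently, with
    probabilities proportional to the expected counts (multinomial draw).
    """
    labels = list(expected_counts)
    weights = [expected_counts[label] for label in labels]
    return Counter(rng.choices(labels, weights=weights, k=n_events))

rng = random.Random(7)
compositions = [draw_composition(50, EXPECTED_COUNTS, rng) for _ in range(1000)]

# Average number of events per process over the ensemble.
mean_counts = {label: sum(c[label] for c in compositions) / len(compositions)
               for label in EXPECTED_COUNTS}

# Nominal fractions implied by the expected counts: 70%, 24%, and 6%.
total_expected = sum(EXPECTED_COUNTS.values())
fractions = {label: n / total_expected for label, n in EXPECTED_COUNTS.items()}
```

The nominal fractions of 70%, 24%, and 6% follow directly from 35, 12, and 3 events out of 50, and the ratio 3/12 reproduces the quoted WWjj contribution of about one fourth of the Z/γ*(→ τ+τ−)jj background.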

Figure 13 shows the calibration curves of the top quark mass and the b-jet energy scale. Their slopes degrade slightly 
to 92 ± 6% and 94 ± 4%, respectively. The pull widths increase to 1.06 ± 0.01 and 1.08 ± 0.01. In a measurement, the 
fitted values have to be adjusted according to the calibration curve, and the S_b uncertainty has to be scaled by the 
pull width. 
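
Such a correction can be sketched as follows, assuming a straight-line calibration curve of the form fitted = reference + offset + slope × (true − reference). The division of the uncertainty by the slope, the function name `apply_calibration`, and the fit result used as input are assumptions for illustration; only the slope of 0.92 and the pull width of 1.06 are taken from the text.

```python
def apply_calibration(fitted, fitted_unc, slope, offset, pull_width, reference):
    """Correct a fit result with a straight-line calibration curve.

    Assumed calibration model:
        fitted = reference + offset + slope * (true - reference)
    The uncertainty is divided by the slope and scaled by the pull width.
    """
    calibrated = reference + (fitted - reference - offset) / slope
    calibrated_unc = fitted_unc * pull_width / slope
    return calibrated, calibrated_unc

# Slope and pull width for m_t as quoted in the text; the offset and the
# fit result (171.84 +- 2.0 GeV) are hypothetical inputs.
m_cal, m_unc = apply_calibration(fitted=171.84, fitted_unc=2.0,
                                 slope=0.92, offset=0.0,
                                 pull_width=1.06, reference=170.0)
print(f"calibrated m_t = {m_cal:.2f} +- {m_unc:.2f} GeV")
```

A slope below unity means that the uncorrected fit underestimates shifts away from the reference point, so both the central value and its uncertainty grow when the calibration is applied.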

Note that perfect calibration curves are not expected in this case, because the WWjj background is not 
described by a separate likelihood. This study shows that it is possible to perform the measurement even when a 
background source is not accounted for in the likelihood. 

FIG. 12: Dilepton channel: Measurement of m_t and S_b in ensembles including 30% of Z/γ*(→ τ+τ−)jj background. 
Reconstructed ("rec") vs. true values are shown in plots (a) and (b), pull widths vs. true values in plots (c)-(d). In plots (a) and (b) 
the lines show the results of straight-line fits, while in plots (c) and (d) they indicate the mean values. 

For an integrated luminosity of 12 fb^-1, a signal fraction f_tt̄ = 70%, and absolute background fractions of 
f_Z/γ*(→τ+τ−)jj = 24% and f_WWjj = 6%, the expected statistical uncertainties in the eμ channel obtained by a 
single Tevatron experiment are found to be 

$$
\begin{aligned}
\sigma_{m_t}(e\mu) &= 2.3~\text{GeV} \text{ and} \\
\sigma_{S_b}(e\mu) &= 0.028.
\end{aligned}
\tag{22}
$$

The slopes of the calibration curves in Figures 13(a) and (b) have been accounted for, and the uncertainties have 
been multiplied by the pull widths shown in Figures 13(c) and (d). The statistical correlation between the m_t and 
S_b measurements is −55%. 
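
For illustration, the quoted uncertainties and correlation can be combined into a covariance matrix. The block below is a worked example that assumes a two-dimensional Gaussian approximation of the likelihood; the symbol V and the correlation coefficient ρ = −0.55 are notation introduced here.

```latex
V =
\begin{pmatrix}
\sigma_{m_t}^2 & \rho\,\sigma_{m_t}\sigma_{S_b} \\
\rho\,\sigma_{m_t}\sigma_{S_b} & \sigma_{S_b}^2
\end{pmatrix}
\approx
\begin{pmatrix}
5.29~\text{GeV}^2 & -0.035~\text{GeV} \\
-0.035~\text{GeV} & 7.8 \times 10^{-4}
\end{pmatrix},
\qquad
\sigma_{m_t}\sqrt{1-\rho^2} \approx 1.9~\text{GeV}
```

In this approximation, the last expression is the uncertainty on m_t that would remain if S_b were known exactly; the difference from 2.3 GeV indicates how much of the statistical uncertainty stems from determining the b-jet energy scale in the same fit.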



VII. SYSTEMATIC UNCERTAINTIES 



For a measurement, uncertainties in the properties of the full simulation (a GEANT-based detector simulation for all 
relevant processes) used for the calibration have to be accounted for by systematic uncertainties on the measurement 
result. These systematic uncertainties are not necessarily equal in magnitude to the corrections derived in the 
calibration. Systematic uncertainties arise from three sources: modeling of the detector performance, uncertainties in 
the method itself, and modeling of the physics processes for tt̄ production and background. 

FIG. 13: Dilepton channel: Measurement of m_t and S_b in ensembles including 24% of Z/γ*(→ τ+τ−)jj and 6% of WWjj 
background. In plots (a) and (b) the lines show the results of straight-line fits, while in plots (c) and (d) they indicate the mean values. 



In the first top quark mass measurements, the largest systematic uncertainties related to the detector performance 
originated from the absolute jet energy scales S_b and S_l. With the technique described in this paper, these uncertainties 
can be absorbed into the statistical uncertainty. Uncertainties on the top quark mass due to the |η| or energy 
dependencies of the jet energy scales or due to other detector effects like energy-dependent efficiencies are typically 
much smaller. 

An uncertainty arises from the finite event samples used to calibrate the method, which is reflected in uncertainties 
on the calibration curves shown e.g. in Figures [51 and [T51 These uncertainties can be reduced when larger simulated 
event samples are used. Since all other effects are accounted for by the uncertainties on the properties of the full 
simulation used in the calibration, no additional uncertainties are assigned to the measurement method itself. 

A significant systematic uncertainty in previous top quark mass measurements was due to the uncertainty in 
modeling of initial- and final-state gluon radiation. The most basic uncertainty is related to the overall fraction of 
events with significant radiation. Since the jet energy scales are measured from the data, it can be expected that the 
method is insensitive to the amount of (soft) gluon radiation off the final-state quarks, while knowledge of the amount 
of initial-state radiation is important. Dedicated ensemble tests to study events with significant radiation have been 
performed and are described in the following section. 



A. Studies of tt̄ Events with Initial- and Final-State Radiation 



The model described so far does not account for tt̄ events with an additional hard parton from initial- or final-state 
radiation (tt̄j). When the tt̄ events are replaced by such tt̄j events, ensemble tests yield deviations of about 4 GeV 
from the nominal top quark mass in both the lepton+jets and dilepton channels. Thus, the method presented so far 
relies on knowledge of the fraction f_tt̄j of such events, and an uncertainty on f_tt̄j directly translates into an 
uncertainty on the top quark mass. 

An ensemble test has been performed with tt̄j events using an extended model, which was first described 
in [21]. The assumption of zero transverse momentum of the tt̄ system is dropped, an additional integration over the 
two transverse momentum components of the tt̄ system is performed in the likelihood calculation, and an additional 
factor W_pT is introduced which describes the likelihood to obtain a tt̄ system with a given transverse momentum. 
The ensemble test is performed with pseudo-experiments containing 50 dilepton tt̄j signal events. The true top quark 
mass is 170 GeV and the true b-jet energy scale is 1.0. Figure 14 shows the results of this ensemble test, which yields 
expected central measurement values for the top quark mass of 170.6 ± 1.0 GeV and for the b-jet energy scale of 
1.004 ± 0.013, consistent within uncertainties with the input values. The pull widths are consistent with 1.0 in both 
cases. 




FIG. 14: Dilepton channel: Measurement of m_t and S_b in pseudo-experiments of 50 dilepton tt̄ signal events with non-zero 
tt̄ transverse momentum. True values of m_t = 170 GeV and S_b = 1.0 were used. Shown are the distributions of (a) the 
reconstructed top quark mass and (b) the reconstructed b-jet energy scale. 

This test shows that it is possible to adequately describe tt̄j events in the method. To reduce the systematic error 
on the top quark mass that arises from the uncertainty on the fraction of tt̄j events in the data sample, the method 
should be extended in the future by introducing the fraction of such events as an additional unknown parameter to 
be measured from the data, similar to the parameters S_b and S_l. 

VIII. CONCLUSIONS 

The Matrix Element method is a powerful analysis tool that has been applied with great success in measurements 
of the top quark mass, the discovery of electroweak single top quark production, and searches for the Higgs boson. 
In this paper, a detailed introduction to the method is given with the aim of facilitating its application to further 
measurements. The principle of the method is introduced, and details concerning the description of the detector 
response are given. 

It has been proposed previously to overcome the current limitation in top quark mass measurements arising from 
experimental systematic uncertainties by a simultaneous determination of the top quark mass as well as the absolute 
energy scales for both b-quark and light-quark jets. The paper discusses how this strategy can be implemented 
naturally in the Matrix Element method for measurements in both lepton+jets and dilepton events at hadron colliders. 
It is shown that the limiting systematic uncertainty in current measurements (arising from the absolute energy scale 
for b-quark jets) can be overcome. In the future, it should be possible to render the method stable also against 
systematic uncertainties related to the fraction of events with significant initial- or final-state radiation. 

In conclusion, we have given a general introduction to the Matrix Element method, and we have shown how future 
measurements of the top quark mass can be performed with the Matrix Element method in order to reduce the 
experimental systematic error. 






Acknowledgements 

The authors would like to thank Gaston Gutierrez and Juan Estrada for their fundamental contributions to the 
development of the Matrix Element method, many of which are part of the foundation for the work presented here. 
Also, the authors would like to thank Raimund Strohmer for his careful reading of the manuscript and his very 
valuable comments, and all their colleagues at the Tevatron experiments DØ and CDF for many helpful discussions. 
All authors have previously been employed at Munich University (LMU), where a substantial part of the work towards 
this paper has been performed, and would like to thank Dorothee Schaile, Otmar Biebel, and all members of the LMU 
experimental particle physics group. 



[1] V. M. Abazov et al., Nature 429 (2004) 638; 
    V. M. Abazov et al., Phys. Lett. B 617 (2005) 1; 
    K. Kondo, J. Phys. Soc. Jpn. 60 (1991) 836; 
    R. H. Dalitz and G. R. Goldstein, Phys. Rev. D 45 (1992) 1531. 
[2] A. Abulencia et al., Phys. Rev. Lett. 99 (2007) 182002; 
    T. Aaltonen et al., Phys. Rev. Lett. 102 (2009) 152001; 
    V. M. Abazov et al., Phys. Rev. Lett. 101 (2008) 182001. 
[3] V. M. Abazov et al., Phys. Rev. Lett. 103 (2009) 092001; 
    T. Aaltonen et al., arXiv:1004.1181 [hep-ex] (2010). 
[4] F. Fiedler, habilitation thesis at Munich University (2007), arXiv:1003.0521. 
[5] F. Fiedler, Eur. Phys. J. C 53 (2008) 41. 
[6] C. Amsler et al., Phys. Lett. B 667 (2008) 1, and 2009 partial update for the 2010 edition. 
[7] R. Brun and F. Carminati, CERN Programming Library Long Writeup W5013 (1993). 
[8] V. M. Abazov et al., Phys. Rev. D 74 (2006) 092005. 
[9] T. Aaltonen et al., Phys. Rev. D 79 (2009) 072001. 
[10] V. M. Abazov et al., Phys. Rev. D 75 (2007) 092007. 
[11] R. Barlow, "Application of the bootstrap resampling technique to particle physics experiments," MAN/HEP/99/4 (2000), 
    http://www.hep.man.ac.uk/preprints/manhep99-4.ps 
[12] F. Maltoni and T. Stelzer, JHEP 0302 (2003) 027. 
[13] M. L. Mangano, M. Moretti, F. Piccinini, R. Pittau and A. D. Polosa, JHEP 0307 (2003) 001. 
[14] H. L. Lai et al., Eur. Phys. J. C 12 (2000) 375. 
[15] P. Schieferdecker, PhD thesis at Munich University (2005), FERMILAB-THESIS-2005-46. 
[16] G. Mahlon and S. J. Parke, Phys. Lett. B 411 (1997) 173. 
[17] G. P. Lepage, J. Comput. Phys. 27 (1978) 192. 
[18] G. P. Lepage, Cornell preprint CLNS-80/447 (1980). 
[19] F. A. Berends, H. Kuijf, B. Tausk and W. T. Giele, Nucl. Phys. B 357 (1991) 32. 
[20] P. Haefner, PhD thesis at Munich University (2008), FERMILAB-THESIS-2008-51. 
[21] A. Grohsjean, PhD thesis at Munich University (2008), FERMILAB-THESIS-2008-92. 
[22] V. M. Abazov et al., Phys. Lett. B 679 (2009) 177. 



